{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 数据框出报表：数据科学之分进合击分析策略\n",
    "![06_groupby.svg](https://pandas.pydata.org/pandas-docs/version/1.0.2/_images/06_groupby.svg)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'pandas.core.frame.DataFrame'>\n",
      "RangeIndex: 494 entries, 0 to 493\n",
      "Data columns (total 10 columns):\n",
      "排名              494 non-null int64\n",
      "企业名称            494 non-null object\n",
      "Company Name    494 non-null object\n",
      "估值（亿人民币）        494 non-null int64\n",
      "国家              494 non-null object\n",
      "城市              494 non-null object\n",
      "行业              494 non-null object\n",
      "掌门人/创始人         494 non-null object\n",
      "成立年份            494 non-null int64\n",
      "部分投资机构          494 non-null object\n",
      "dtypes: int64(3), object(7)\n",
      "memory usage: 38.7+ KB\n"
     ]
    }
   ],
   "source": [
    "# A0 简单读档并查看数据框讯息\n",
    "# 注意看Dtype! \n",
    "df = pd.read_csv (\"20春_pandas_week02_hurun_unicorn.tsv\", encoding = \"utf8\", sep=\"\\t\")\n",
    "df.info()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* int数值可用作运算，object对象只能用来分类"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 挑战A1：如法泡制\n",
    "\n",
    "\n",
    "* 先国再城\n",
    "* 先行再城"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 99,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>企业名称</th>\n",
       "      <th colspan=\"2\" halign=\"left\">估值（亿人民币）</th>\n",
       "      <th colspan=\"2\" halign=\"left\">成立年份</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>数量</th>\n",
       "      <th>总和</th>\n",
       "      <th>均值</th>\n",
       "      <th>最新</th>\n",
       "      <th>最早</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>国家</th>\n",
       "      <th>行业</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td rowspan=\"2\" valign=\"top\">中国</td>\n",
       "      <td>金融科技</td>\n",
       "      <td>22</td>\n",
       "      <td>17960</td>\n",
       "      <td>816.363636</td>\n",
       "      <td>2018</td>\n",
       "      <td>2002</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>媒体和娱乐</td>\n",
       "      <td>17</td>\n",
       "      <td>8230</td>\n",
       "      <td>484.117647</td>\n",
       "      <td>2015</td>\n",
       "      <td>2003</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td rowspan=\"3\" valign=\"top\">美国</td>\n",
       "      <td>云计算</td>\n",
       "      <td>32</td>\n",
       "      <td>6880</td>\n",
       "      <td>215.000000</td>\n",
       "      <td>2015</td>\n",
       "      <td>2000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>共享经济</td>\n",
       "      <td>6</td>\n",
       "      <td>5670</td>\n",
       "      <td>945.000000</td>\n",
       "      <td>2017</td>\n",
       "      <td>2008</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>金融科技</td>\n",
       "      <td>21</td>\n",
       "      <td>5020</td>\n",
       "      <td>239.047619</td>\n",
       "      <td>2017</td>\n",
       "      <td>2000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>日本</td>\n",
       "      <td>区块链</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2014</td>\n",
       "      <td>2014</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td rowspan=\"2\" valign=\"top\">法国</td>\n",
       "      <td>人工智能</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2016</td>\n",
       "      <td>2016</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>媒体和娱乐</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2006</td>\n",
       "      <td>2006</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>爱沙尼亚</td>\n",
       "      <td>共享经济</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2013</td>\n",
       "      <td>2013</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>法国</td>\n",
       "      <td>健康科技</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2013</td>\n",
       "      <td>2013</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>103 rows × 5 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "           企业名称 估值（亿人民币）              成立年份      \n",
       "             数量       总和          均值    最新    最早\n",
       "国家   行业                                         \n",
       "中国   金融科技    22    17960  816.363636  2018  2002\n",
       "     媒体和娱乐   17     8230  484.117647  2015  2003\n",
       "美国   云计算     32     6880  215.000000  2015  2000\n",
       "     共享经济     6     5670  945.000000  2017  2008\n",
       "     金融科技    21     5020  239.047619  2017  2000\n",
       "...         ...      ...         ...   ...   ...\n",
       "日本   区块链      1       70   70.000000  2014  2014\n",
       "法国   人工智能     1       70   70.000000  2016  2016\n",
       "     媒体和娱乐    1       70   70.000000  2006  2006\n",
       "爱沙尼亚 共享经济     1       70   70.000000  2013  2013\n",
       "法国   健康科技     1       70   70.000000  2013  2013\n",
       "\n",
       "[103 rows x 5 columns]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>企业名称</th>\n",
       "      <th colspan=\"2\" halign=\"left\">估值（亿人民币）</th>\n",
       "      <th colspan=\"2\" halign=\"left\">成立年份</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>数量</th>\n",
       "      <th>总和</th>\n",
       "      <th>均值</th>\n",
       "      <th>最新</th>\n",
       "      <th>最早</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>行业</th>\n",
       "      <th>国家</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>金融科技</td>\n",
       "      <td>中国</td>\n",
       "      <td>22</td>\n",
       "      <td>17960</td>\n",
       "      <td>816.363636</td>\n",
       "      <td>2018</td>\n",
       "      <td>2002</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>媒体和娱乐</td>\n",
       "      <td>中国</td>\n",
       "      <td>17</td>\n",
       "      <td>8230</td>\n",
       "      <td>484.117647</td>\n",
       "      <td>2015</td>\n",
       "      <td>2003</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>云计算</td>\n",
       "      <td>美国</td>\n",
       "      <td>32</td>\n",
       "      <td>6880</td>\n",
       "      <td>215.000000</td>\n",
       "      <td>2015</td>\n",
       "      <td>2000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>共享经济</td>\n",
       "      <td>美国</td>\n",
       "      <td>6</td>\n",
       "      <td>5670</td>\n",
       "      <td>945.000000</td>\n",
       "      <td>2017</td>\n",
       "      <td>2008</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>金融科技</td>\n",
       "      <td>美国</td>\n",
       "      <td>21</td>\n",
       "      <td>5020</td>\n",
       "      <td>239.047619</td>\n",
       "      <td>2017</td>\n",
       "      <td>2000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>房地产科技</td>\n",
       "      <td>菲律宾</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2015</td>\n",
       "      <td>2015</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>物流</td>\n",
       "      <td>哥伦比亚</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2016</td>\n",
       "      <td>2016</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>游戏</td>\n",
       "      <td>印度</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2012</td>\n",
       "      <td>2012</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>消费品</td>\n",
       "      <td>芬兰</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2016</td>\n",
       "      <td>2016</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>金融科技</td>\n",
       "      <td>韩国</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2011</td>\n",
       "      <td>2011</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>103 rows × 5 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "           企业名称 估值（亿人民币）              成立年份      \n",
       "             数量       总和          均值    最新    最早\n",
       "行业    国家                                        \n",
       "金融科技  中国     22    17960  816.363636  2018  2002\n",
       "媒体和娱乐 中国     17     8230  484.117647  2015  2003\n",
       "云计算   美国     32     6880  215.000000  2015  2000\n",
       "共享经济  美国      6     5670  945.000000  2017  2008\n",
       "金融科技  美国     21     5020  239.047619  2017  2000\n",
       "...         ...      ...         ...   ...   ...\n",
       "房地产科技 菲律宾     1       70   70.000000  2015  2015\n",
       "物流    哥伦比亚    1       70   70.000000  2016  2016\n",
       "游戏    印度      1       70   70.000000  2012  2012\n",
       "消费品   芬兰      1       70   70.000000  2016  2016\n",
       "金融科技  韩国      1       70   70.000000  2011  2011\n",
       "\n",
       "[103 rows x 5 columns]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# A1 原完整代码\n",
    "先国再行 = df.groupby ( by = ['国家','行业'] ) \\\n",
    "             .agg ({ \"企业名称\" : \"count\", \\\n",
    "                     \"估值（亿人民币）\":[\"sum\",\"mean\"], \\\n",
    "                     \"成立年份\":[\"max\",\"min\"],               }) \\\n",
    "             .sort_values ( by = [(\"估值（亿人民币）\",\"sum\")], ascending = False) \\\n",
    "             .rename ( columns = {\"sum\":\"总和\", \"mean\":\"均值\", \"count\":\"数量\", \"max\":\"最新\", \"min\":\"最早\"} )\n",
    "先行再国 = df.groupby ( by = ['行业', '国家'] ) \\\n",
    "             .agg ({ \"企业名称\" : \"count\", \\\n",
    "                     \"估值（亿人民币）\":[\"sum\",\"mean\"], \\\n",
    "                     \"成立年份\":[\"max\",\"min\"],               }) \\\n",
    "             .sort_values ( by = [(\"估值（亿人民币）\",\"sum\")], ascending = False) \\\n",
    "             .rename ( columns = {\"sum\":\"总和\", \"mean\":\"均值\", \"count\":\"数量\", \"max\":\"最新\", \"min\":\"最早\"} )\n",
    "display(先国再行)\n",
    "display(先行再国)\n",
    "\n",
    "with pd.ExcelWriter(\"20春_pandas_week03_hurun_unicorn.xlsx\") as writer:\n",
    "    先国再行.to_excel(writer,sheet_name=\"先国再行\") \n",
    "    先行再国.to_excel(writer,sheet_name=\"先行再国\") "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 100,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>企业名称</th>\n",
       "      <th colspan=\"2\" halign=\"left\">估值（亿人民币）</th>\n",
       "      <th colspan=\"2\" halign=\"left\">成立年份</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>数量</th>\n",
       "      <th>总和</th>\n",
       "      <th>均值</th>\n",
       "      <th>最新</th>\n",
       "      <th>最早</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>国家</th>\n",
       "      <th>城市</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>中国</td>\n",
       "      <td>北京</td>\n",
       "      <td>81</td>\n",
       "      <td>22130</td>\n",
       "      <td>273.209877</td>\n",
       "      <td>2019</td>\n",
       "      <td>2001</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>美国</td>\n",
       "      <td>旧金山</td>\n",
       "      <td>55</td>\n",
       "      <td>17060</td>\n",
       "      <td>310.181818</td>\n",
       "      <td>2017</td>\n",
       "      <td>2004</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td rowspan=\"2\" valign=\"top\">中国</td>\n",
       "      <td>杭州</td>\n",
       "      <td>19</td>\n",
       "      <td>13290</td>\n",
       "      <td>699.473684</td>\n",
       "      <td>2015</td>\n",
       "      <td>2000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>上海</td>\n",
       "      <td>47</td>\n",
       "      <td>8990</td>\n",
       "      <td>191.276596</td>\n",
       "      <td>2017</td>\n",
       "      <td>2001</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>美国</td>\n",
       "      <td>纽约</td>\n",
       "      <td>25</td>\n",
       "      <td>8640</td>\n",
       "      <td>345.600000</td>\n",
       "      <td>2015</td>\n",
       "      <td>2002</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>哥伦比亚</td>\n",
       "      <td>波哥大</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2016</td>\n",
       "      <td>2016</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td rowspan=\"2\" valign=\"top\">美国</td>\n",
       "      <td>盐湖城</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2008</td>\n",
       "      <td>2008</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>罗利</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2011</td>\n",
       "      <td>2011</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>印度</td>\n",
       "      <td>孟买</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2012</td>\n",
       "      <td>2012</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>美国</td>\n",
       "      <td>半月湾</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2014</td>\n",
       "      <td>2014</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>121 rows × 5 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "         企业名称 估值（亿人民币）              成立年份      \n",
       "           数量       总和          均值    最新    最早\n",
       "国家   城市                                       \n",
       "中国   北京    81    22130  273.209877  2019  2001\n",
       "美国   旧金山   55    17060  310.181818  2017  2004\n",
       "中国   杭州    19    13290  699.473684  2015  2000\n",
       "     上海    47     8990  191.276596  2017  2001\n",
       "美国   纽约    25     8640  345.600000  2015  2002\n",
       "...       ...      ...         ...   ...   ...\n",
       "哥伦比亚 波哥大    1       70   70.000000  2016  2016\n",
       "美国   盐湖城    1       70   70.000000  2008  2008\n",
       "     罗利     1       70   70.000000  2011  2011\n",
       "印度   孟买     1       70   70.000000  2012  2012\n",
       "美国   半月湾    1       70   70.000000  2014  2014\n",
       "\n",
       "[121 rows x 5 columns]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>企业名称</th>\n",
       "      <th colspan=\"2\" halign=\"left\">估值（亿人民币）</th>\n",
       "      <th colspan=\"2\" halign=\"left\">成立年份</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>数量</th>\n",
       "      <th>总和</th>\n",
       "      <th>均值</th>\n",
       "      <th>最新</th>\n",
       "      <th>最早</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>城市</th>\n",
       "      <th>国家</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>北京</td>\n",
       "      <td>中国</td>\n",
       "      <td>81</td>\n",
       "      <td>22130</td>\n",
       "      <td>273.209877</td>\n",
       "      <td>2019</td>\n",
       "      <td>2001</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>旧金山</td>\n",
       "      <td>美国</td>\n",
       "      <td>55</td>\n",
       "      <td>17060</td>\n",
       "      <td>310.181818</td>\n",
       "      <td>2017</td>\n",
       "      <td>2004</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>杭州</td>\n",
       "      <td>中国</td>\n",
       "      <td>19</td>\n",
       "      <td>13290</td>\n",
       "      <td>699.473684</td>\n",
       "      <td>2015</td>\n",
       "      <td>2000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>上海</td>\n",
       "      <td>中国</td>\n",
       "      <td>47</td>\n",
       "      <td>8990</td>\n",
       "      <td>191.276596</td>\n",
       "      <td>2017</td>\n",
       "      <td>2001</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>纽约</td>\n",
       "      <td>美国</td>\n",
       "      <td>25</td>\n",
       "      <td>8640</td>\n",
       "      <td>345.600000</td>\n",
       "      <td>2015</td>\n",
       "      <td>2002</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>塔林</td>\n",
       "      <td>爱沙尼亚</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2013</td>\n",
       "      <td>2013</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>普莱森顿</td>\n",
       "      <td>美国</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2012</td>\n",
       "      <td>2012</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>苗必达</td>\n",
       "      <td>美国</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2007</td>\n",
       "      <td>2007</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>菲尼克斯</td>\n",
       "      <td>美国</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2015</td>\n",
       "      <td>2015</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>马德里</td>\n",
       "      <td>西班牙</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2011</td>\n",
       "      <td>2011</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>121 rows × 5 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "          企业名称 估值（亿人民币）              成立年份      \n",
       "            数量       总和          均值    最新    最早\n",
       "城市   国家                                        \n",
       "北京   中国     81    22130  273.209877  2019  2001\n",
       "旧金山  美国     55    17060  310.181818  2017  2004\n",
       "杭州   中国     19    13290  699.473684  2015  2000\n",
       "上海   中国     47     8990  191.276596  2017  2001\n",
       "纽约   美国     25     8640  345.600000  2015  2002\n",
       "...        ...      ...         ...   ...   ...\n",
       "塔林   爱沙尼亚    1       70   70.000000  2013  2013\n",
       "普莱森顿 美国      1       70   70.000000  2012  2012\n",
       "苗必达  美国      1       70   70.000000  2007  2007\n",
       "菲尼克斯 美国      1       70   70.000000  2015  2015\n",
       "马德里  西班牙     1       70   70.000000  2011  2011\n",
       "\n",
       "[121 rows x 5 columns]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# A1-Extra 完整代码，多来2页：先国再城，先行再城\n",
    "先国再城 = df.groupby ( by = ['国家','城市'] ) \\\n",
    "             .agg ({ \"企业名称\" : \"count\", \\\n",
    "                     \"估值（亿人民币）\":[\"sum\",\"mean\"], \\\n",
    "                     \"成立年份\":[\"max\",\"min\"],               }) \\\n",
    "             .sort_values ( by = [(\"估值（亿人民币）\",\"sum\")], ascending = False) \\\n",
    "             .rename ( columns = {\"sum\":\"总和\", \"mean\":\"均值\", \"count\":\"数量\", \"max\":\"最新\", \"min\":\"最早\"} )\n",
    "先城再国 = df.groupby ( by = ['城市', '国家'] ) \\\n",
    "             .agg ({ \"企业名称\" : \"count\", \\\n",
    "                     \"估值（亿人民币）\":[\"sum\",\"mean\"], \\\n",
    "                     \"成立年份\":[\"max\",\"min\"],               }) \\\n",
    "             .sort_values ( by = [(\"估值（亿人民币）\",\"sum\")], ascending = False) \\\n",
    "             .rename ( columns = {\"sum\":\"总和\", \"mean\":\"均值\", \"count\":\"数量\", \"max\":\"最新\", \"min\":\"最早\"} )\n",
    "display(先国再城)\n",
    "display(先城再国)\n",
    "\n",
    "with pd.ExcelWriter(\"20春_pandas_week03_hurun_unicorn.xlsx\") as writer:\n",
    "    先国再城.to_excel(writer,sheet_name=\"先国再城\") \n",
    "    先城再国.to_excel(writer,sheet_name=\"先城再国\") \n",
    "# ( 來來來 )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### rename\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 101,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>企业名称</th>\n",
       "      <th colspan=\"2\" halign=\"left\">估值（亿人民币）</th>\n",
       "      <th colspan=\"2\" halign=\"left\">成立年份</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>count</th>\n",
       "      <th>sum</th>\n",
       "      <th>mean</th>\n",
       "      <th>max</th>\n",
       "      <th>min</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>国家</th>\n",
       "      <th>行业</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td rowspan=\"2\" valign=\"top\">中国</td>\n",
       "      <td>金融科技</td>\n",
       "      <td>22</td>\n",
       "      <td>17960</td>\n",
       "      <td>816.363636</td>\n",
       "      <td>2018</td>\n",
       "      <td>2002</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>媒体和娱乐</td>\n",
       "      <td>17</td>\n",
       "      <td>8230</td>\n",
       "      <td>484.117647</td>\n",
       "      <td>2015</td>\n",
       "      <td>2003</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td rowspan=\"3\" valign=\"top\">美国</td>\n",
       "      <td>云计算</td>\n",
       "      <td>32</td>\n",
       "      <td>6880</td>\n",
       "      <td>215.000000</td>\n",
       "      <td>2015</td>\n",
       "      <td>2000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>共享经济</td>\n",
       "      <td>6</td>\n",
       "      <td>5670</td>\n",
       "      <td>945.000000</td>\n",
       "      <td>2017</td>\n",
       "      <td>2008</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>金融科技</td>\n",
       "      <td>21</td>\n",
       "      <td>5020</td>\n",
       "      <td>239.047619</td>\n",
       "      <td>2017</td>\n",
       "      <td>2000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>日本</td>\n",
       "      <td>区块链</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2014</td>\n",
       "      <td>2014</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td rowspan=\"2\" valign=\"top\">法国</td>\n",
       "      <td>人工智能</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2016</td>\n",
       "      <td>2016</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>媒体和娱乐</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2006</td>\n",
       "      <td>2006</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>爱沙尼亚</td>\n",
       "      <td>共享经济</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2013</td>\n",
       "      <td>2013</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>法国</td>\n",
       "      <td>健康科技</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2013</td>\n",
       "      <td>2013</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>103 rows × 5 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "            企业名称 估值（亿人民币）              成立年份      \n",
       "           count      sum        mean   max   min\n",
       "国家   行业                                          \n",
       "中国   金融科技     22    17960  816.363636  2018  2002\n",
       "     媒体和娱乐    17     8230  484.117647  2015  2003\n",
       "美国   云计算      32     6880  215.000000  2015  2000\n",
       "     共享经济      6     5670  945.000000  2017  2008\n",
       "     金融科技     21     5020  239.047619  2017  2000\n",
       "...          ...      ...         ...   ...   ...\n",
       "日本   区块链       1       70   70.000000  2014  2014\n",
       "法国   人工智能      1       70   70.000000  2016  2016\n",
       "     媒体和娱乐     1       70   70.000000  2006  2006\n",
       "爱沙尼亚 共享经济      1       70   70.000000  2013  2013\n",
       "法国   健康科技      1       70   70.000000  2013  2013\n",
       "\n",
       "[103 rows x 5 columns]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>企业名称</th>\n",
       "      <th colspan=\"2\" halign=\"left\">估值（亿人民币）</th>\n",
       "      <th colspan=\"2\" halign=\"left\">成立年份</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>数量</th>\n",
       "      <th>总和</th>\n",
       "      <th>均值</th>\n",
       "      <th>最新</th>\n",
       "      <th>最早</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>国家</th>\n",
       "      <th>行业</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td rowspan=\"2\" valign=\"top\">中国</td>\n",
       "      <td>金融科技</td>\n",
       "      <td>22</td>\n",
       "      <td>17960</td>\n",
       "      <td>816.363636</td>\n",
       "      <td>2018</td>\n",
       "      <td>2002</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>媒体和娱乐</td>\n",
       "      <td>17</td>\n",
       "      <td>8230</td>\n",
       "      <td>484.117647</td>\n",
       "      <td>2015</td>\n",
       "      <td>2003</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td rowspan=\"3\" valign=\"top\">美国</td>\n",
       "      <td>云计算</td>\n",
       "      <td>32</td>\n",
       "      <td>6880</td>\n",
       "      <td>215.000000</td>\n",
       "      <td>2015</td>\n",
       "      <td>2000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>共享经济</td>\n",
       "      <td>6</td>\n",
       "      <td>5670</td>\n",
       "      <td>945.000000</td>\n",
       "      <td>2017</td>\n",
       "      <td>2008</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>金融科技</td>\n",
       "      <td>21</td>\n",
       "      <td>5020</td>\n",
       "      <td>239.047619</td>\n",
       "      <td>2017</td>\n",
       "      <td>2000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>日本</td>\n",
       "      <td>区块链</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2014</td>\n",
       "      <td>2014</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td rowspan=\"2\" valign=\"top\">法国</td>\n",
       "      <td>人工智能</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2016</td>\n",
       "      <td>2016</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>媒体和娱乐</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2006</td>\n",
       "      <td>2006</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>爱沙尼亚</td>\n",
       "      <td>共享经济</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2013</td>\n",
       "      <td>2013</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>法国</td>\n",
       "      <td>健康科技</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2013</td>\n",
       "      <td>2013</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>103 rows × 5 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "           企业名称 估值（亿人民币）              成立年份      \n",
       "             数量       总和          均值    最新    最早\n",
       "国家   行业                                         \n",
       "中国   金融科技    22    17960  816.363636  2018  2002\n",
       "     媒体和娱乐   17     8230  484.117647  2015  2003\n",
       "美国   云计算     32     6880  215.000000  2015  2000\n",
       "     共享经济     6     5670  945.000000  2017  2008\n",
       "     金融科技    21     5020  239.047619  2017  2000\n",
       "...         ...      ...         ...   ...   ...\n",
       "日本   区块链      1       70   70.000000  2014  2014\n",
       "法国   人工智能     1       70   70.000000  2016  2016\n",
       "     媒体和娱乐    1       70   70.000000  2006  2006\n",
       "爱沙尼亚 共享经济     1       70   70.000000  2013  2013\n",
       "法国   健康科技     1       70   70.000000  2013  2013\n",
       "\n",
       "[103 rows x 5 columns]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# A2-Extra: full code with an intermediate step, to demonstrate rename\n",
    "# Step 1: group, aggregate and sort, keeping the default statistic names\n",
    "先国再行 = df.groupby ( by = ['国家','行业'] ) \\\n",
    "             .agg ({ \"企业名称\" : \"count\", \\\n",
    "                     \"估值（亿人民币）\":[\"sum\",\"mean\"], \\\n",
    "                     \"成立年份\":[\"max\",\"min\"],               }) \\\n",
    "             .sort_values ( by = [(\"估值（亿人民币）\",\"sum\")], ascending = False)\n",
    "\n",
    "display(先国再行)\n",
    "\n",
    "with pd.ExcelWriter(\"20春_pandas_week03_hurun_unicorn.xlsx\") as writer:\n",
    "    先国再行.to_excel(writer,sheet_name=\"先国再行\")\n",
    "\n",
    "# Step 2: the same pipeline, then rename the statistic labels for readers\n",
    "先国再行 = df.groupby ( by = ['国家','行业'] ) \\\n",
    "             .agg ({ \"企业名称\" : \"count\", \\\n",
    "                     \"估值（亿人民币）\":[\"sum\",\"mean\"], \\\n",
    "                     \"成立年份\":[\"max\",\"min\"],               }) \\\n",
    "             .sort_values ( by = [(\"估值（亿人民币）\",\"sum\")], ascending = False) \\\n",
    "             .rename ( columns = {\"sum\":\"总和\", \"mean\":\"均值\", \"count\":\"数量\", \"max\":\"最新\", \"min\":\"最早\"} )\n",
    "display(先国再行)\n",
    "\n",
    "# Writing to the same file name overwrites the workbook saved in step 1\n",
    "with pd.ExcelWriter(\"20春_pandas_week03_hurun_unicorn.xlsx\") as writer:\n",
    "    先国再行.to_excel(writer,sheet_name=\"先国再行\")"
   ]
  },
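  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* The columns argument tells rename() to change column labels. It takes a dictionary whose keys are the existing labels and whose values are the new ones; only the labels listed as keys are changed.\n",
    "* In the report above, the statistic names (sum, mean, count, max, min) sit in the second level of the column labels, which is why they are the keys being replaced.\n",
    "* rename() is well suited to changing individual index or column labels.\n",
    "* Descriptive labels make it clearer to readers what the table shows."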
   ]
  },
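  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The points above can be tried on a tiny table. This is only a sketch: the toy frame and its 估值 column are invented for illustration and are not the Hurun data. It shows that labels missing from the dictionary are left alone, and that the level argument of rename() can limit the change to one level of a two-level column index.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Toy data, invented for illustration only (not the Hurun dataset)\n",
    "toy = pd.DataFrame({\n",
    "    '国家': ['A', 'A', 'B'],\n",
    "    '行业': ['x', 'x', 'y'],\n",
    "    '估值': [10, 30, 20],\n",
    "})\n",
    "\n",
    "# A dict of lists produces two-level column labels: (column, statistic)\n",
    "report = toy.groupby(['国家', '行业']).agg({'估值': ['sum', 'mean']})\n",
    "\n",
    "# Keys are old labels, values are new labels\n",
    "renamed = report.rename(columns={'sum': '总和', 'mean': '均值'})\n",
    "print(renamed.columns.tolist())\n",
    "\n",
    "# Only 'sum' is listed, and level=1 restricts the change to the statistic level\n",
    "renamed_level = report.rename(columns={'sum': '总和'}, level=1)\n",
    "print(renamed_level.columns.tolist())\n",
    "```\n",
    "\n",
    "rename() returns a new frame, so report itself keeps its original labels unless the result is assigned back."
   ]
  },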
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 用agg直接改名"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 102,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>合计</th>\n",
       "      <th>平均</th>\n",
       "      <th>数量</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>国家</th>\n",
       "      <th>行业</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td rowspan=\"5\" valign=\"top\">中国</td>\n",
       "      <td>云计算</td>\n",
       "      <td>460</td>\n",
       "      <td>92.000000</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>人工智能</td>\n",
       "      <td>2090</td>\n",
       "      <td>139.333333</td>\n",
       "      <td>15</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>健康科技</td>\n",
       "      <td>2060</td>\n",
       "      <td>158.461538</td>\n",
       "      <td>13</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>共享经济</td>\n",
       "      <td>4740</td>\n",
       "      <td>592.500000</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>区块链</td>\n",
       "      <td>1250</td>\n",
       "      <td>312.500000</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td rowspan=\"4\" valign=\"top\">韩国</td>\n",
       "      <td>游戏</td>\n",
       "      <td>350</td>\n",
       "      <td>350.000000</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>物流</td>\n",
       "      <td>200</td>\n",
       "      <td>200.000000</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>电子商务</td>\n",
       "      <td>740</td>\n",
       "      <td>246.666667</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>金融科技</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>马耳他</td>\n",
       "      <td>区块链</td>\n",
       "      <td>150</td>\n",
       "      <td>150.000000</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>103 rows × 3 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "            合计          平均  数量\n",
       "国家  行业                        \n",
       "中国  云计算    460   92.000000   5\n",
       "    人工智能  2090  139.333333  15\n",
       "    健康科技  2060  158.461538  13\n",
       "    共享经济  4740  592.500000   8\n",
       "    区块链   1250  312.500000   4\n",
       "...        ...         ...  ..\n",
       "韩国  游戏     350  350.000000   1\n",
       "    物流     200  200.000000   1\n",
       "    电子商务   740  246.666667   3\n",
       "    金融科技    70   70.000000   1\n",
       "马耳他 区块链    150  150.000000   1\n",
       "\n",
       "[103 rows x 3 columns]"
      ]
     },
     "execution_count": 102,
     "metadata": {},
     "output_type": "execute_result"
    }
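   ],
   "source": [
    "# Named aggregation: each keyword is the output column name, each value the function to apply\n",
    "# Only the last expression in a cell is displayed, so the first result is not shown\n",
    "df.groupby([\"国家\",\"行业\"])[\"估值（亿人民币）\"].agg(sum=\"sum\",mean=\"mean\",count=\"count\")\n",
    "df.groupby([\"国家\",\"行业\"])[\"估值（亿人民币）\"].agg(合计=\"sum\",平均=\"mean\",数量=\"count\")"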
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### sort_values\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 103,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>企业名称</th>\n",
       "      <th colspan=\"2\" halign=\"left\">估值（亿人民币）</th>\n",
       "      <th colspan=\"2\" halign=\"left\">成立年份</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>count</th>\n",
       "      <th>sum</th>\n",
       "      <th>mean</th>\n",
       "      <th>max</th>\n",
       "      <th>min</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>国家</th>\n",
       "      <th>行业</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td rowspan=\"2\" valign=\"top\">中国</td>\n",
       "      <td>金融科技</td>\n",
       "      <td>22</td>\n",
       "      <td>17960</td>\n",
       "      <td>816.363636</td>\n",
       "      <td>2018</td>\n",
       "      <td>2002</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>媒体和娱乐</td>\n",
       "      <td>17</td>\n",
       "      <td>8230</td>\n",
       "      <td>484.117647</td>\n",
       "      <td>2015</td>\n",
       "      <td>2003</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td rowspan=\"3\" valign=\"top\">美国</td>\n",
       "      <td>云计算</td>\n",
       "      <td>32</td>\n",
       "      <td>6880</td>\n",
       "      <td>215.000000</td>\n",
       "      <td>2015</td>\n",
       "      <td>2000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>共享经济</td>\n",
       "      <td>6</td>\n",
       "      <td>5670</td>\n",
       "      <td>945.000000</td>\n",
       "      <td>2017</td>\n",
       "      <td>2008</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>金融科技</td>\n",
       "      <td>21</td>\n",
       "      <td>5020</td>\n",
       "      <td>239.047619</td>\n",
       "      <td>2017</td>\n",
       "      <td>2000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>日本</td>\n",
       "      <td>区块链</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2014</td>\n",
       "      <td>2014</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td rowspan=\"2\" valign=\"top\">法国</td>\n",
       "      <td>人工智能</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2016</td>\n",
       "      <td>2016</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>媒体和娱乐</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2006</td>\n",
       "      <td>2006</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>爱沙尼亚</td>\n",
       "      <td>共享经济</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2013</td>\n",
       "      <td>2013</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>法国</td>\n",
       "      <td>健康科技</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2013</td>\n",
       "      <td>2013</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>103 rows × 5 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "            企业名称 估值（亿人民币）              成立年份      \n",
       "           count      sum        mean   max   min\n",
       "国家   行业                                          \n",
       "中国   金融科技     22    17960  816.363636  2018  2002\n",
       "     媒体和娱乐    17     8230  484.117647  2015  2003\n",
       "美国   云计算      32     6880  215.000000  2015  2000\n",
       "     共享经济      6     5670  945.000000  2017  2008\n",
       "     金融科技     21     5020  239.047619  2017  2000\n",
       "...          ...      ...         ...   ...   ...\n",
       "日本   区块链       1       70   70.000000  2014  2014\n",
       "法国   人工智能      1       70   70.000000  2016  2016\n",
       "     媒体和娱乐     1       70   70.000000  2006  2006\n",
       "爱沙尼亚 共享经济      1       70   70.000000  2013  2013\n",
       "法国   健康科技      1       70   70.000000  2013  2013\n",
       "\n",
       "[103 rows x 5 columns]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>企业名称</th>\n",
       "      <th colspan=\"2\" halign=\"left\">估值（亿人民币）</th>\n",
       "      <th colspan=\"2\" halign=\"left\">成立年份</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>数量</th>\n",
       "      <th>总和</th>\n",
       "      <th>均值</th>\n",
       "      <th>最新</th>\n",
       "      <th>最早</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>国家</th>\n",
       "      <th>行业</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td rowspan=\"5\" valign=\"top\">中国</td>\n",
       "      <td>云计算</td>\n",
       "      <td>5</td>\n",
       "      <td>460</td>\n",
       "      <td>92.000000</td>\n",
       "      <td>2015</td>\n",
       "      <td>2011</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>人工智能</td>\n",
       "      <td>15</td>\n",
       "      <td>2090</td>\n",
       "      <td>139.333333</td>\n",
       "      <td>2016</td>\n",
       "      <td>2009</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>健康科技</td>\n",
       "      <td>13</td>\n",
       "      <td>2060</td>\n",
       "      <td>158.461538</td>\n",
       "      <td>2019</td>\n",
       "      <td>2000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>共享经济</td>\n",
       "      <td>8</td>\n",
       "      <td>4740</td>\n",
       "      <td>592.500000</td>\n",
       "      <td>2016</td>\n",
       "      <td>2011</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>区块链</td>\n",
       "      <td>4</td>\n",
       "      <td>1250</td>\n",
       "      <td>312.500000</td>\n",
       "      <td>2017</td>\n",
       "      <td>2013</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td rowspan=\"4\" valign=\"top\">韩国</td>\n",
       "      <td>游戏</td>\n",
       "      <td>1</td>\n",
       "      <td>350</td>\n",
       "      <td>350.000000</td>\n",
       "      <td>2007</td>\n",
       "      <td>2007</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>物流</td>\n",
       "      <td>1</td>\n",
       "      <td>200</td>\n",
       "      <td>200.000000</td>\n",
       "      <td>2011</td>\n",
       "      <td>2011</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>电子商务</td>\n",
       "      <td>3</td>\n",
       "      <td>740</td>\n",
       "      <td>246.666667</td>\n",
       "      <td>2010</td>\n",
       "      <td>2005</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>金融科技</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2011</td>\n",
       "      <td>2011</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>马耳他</td>\n",
       "      <td>区块链</td>\n",
       "      <td>1</td>\n",
       "      <td>150</td>\n",
       "      <td>150.000000</td>\n",
       "      <td>2017</td>\n",
       "      <td>2017</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>103 rows × 5 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "         企业名称 估值（亿人民币）              成立年份      \n",
       "           数量       总和          均值    最新    最早\n",
       "国家  行业                                        \n",
       "中国  云计算     5      460   92.000000  2015  2011\n",
       "    人工智能   15     2090  139.333333  2016  2009\n",
       "    健康科技   13     2060  158.461538  2019  2000\n",
       "    共享经济    8     4740  592.500000  2016  2011\n",
       "    区块链     4     1250  312.500000  2017  2013\n",
       "...       ...      ...         ...   ...   ...\n",
       "韩国  游戏      1      350  350.000000  2007  2007\n",
       "    物流      1      200  200.000000  2011  2011\n",
       "    电子商务    3      740  246.666667  2010  2005\n",
       "    金融科技    1       70   70.000000  2011  2011\n",
       "马耳他 区块链     1      150  150.000000  2017  2017\n",
       "\n",
       "[103 rows x 5 columns]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# A3-Extra: full code with an intermediate step, to demonstrate sort_values\n",
    "# Step 1: sorted by total valuation, largest first\n",
    "先国再行 = df.groupby ( by = ['国家','行业'] ) \\\n",
    "             .agg ({ \"企业名称\" : \"count\", \\\n",
    "                     \"估值（亿人民币）\":[\"sum\",\"mean\"], \\\n",
    "                     \"成立年份\":[\"max\",\"min\"],               }) \\\n",
    "             .sort_values ( by = [(\"估值（亿人民币）\",\"sum\")], ascending = False)\n",
    "\n",
    "display(先国再行)\n",
    "\n",
    "with pd.ExcelWriter(\"20春_pandas_week03_hurun_unicorn.xlsx\") as writer:\n",
    "    先国再行.to_excel(writer,sheet_name=\"先国再行\")\n",
    "\n",
    "# Step 2: no sort_values, so rows stay in groupby's default order (sorted by the group keys)\n",
    "先国再行 = df.groupby ( by = ['国家','行业'] ) \\\n",
    "             .agg ({ \"企业名称\" : \"count\", \\\n",
    "                     \"估值（亿人民币）\":[\"sum\",\"mean\"], \\\n",
    "                     \"成立年份\":[\"max\",\"min\"],               }) \\\n",
    "             .rename ( columns = {\"sum\":\"总和\", \"mean\":\"均值\", \"count\":\"数量\", \"max\":\"最新\", \"min\":\"最早\"} )\n",
    "\n",
    "display(先国再行)\n",
    "\n",
    "# Writing to the same file name overwrites the workbook saved in step 1\n",
    "with pd.ExcelWriter(\"20春_pandas_week03_hurun_unicorn.xlsx\") as writer:\n",
    "    先国再行.to_excel(writer,sheet_name=\"先国再行\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## pandas中的sort_values()函数原理类似于SQL中的order by，可以将数据集依照某个字段中的数据进行排序，该函数即可根据指定列数据也可根据指定行的数据排序。\n",
    "* by \t指定列名(axis=0或’index’)或索引值(axis=1或’columns’)\n",
    "* axis \t若axis=0或’index’，则按照指定列中数据大小排序；若axis=1或’columns’，则按照指定索引中数据大小排序，默认axis=0\n",
    "* ascending \t是否按指定列的数组升序排列，默认为True，即升序排列\n",
    "* inplace \t是否用排序后的数据集替换原来的数据，默认为False，即不替换\n",
    "* na_position \t{‘first’,‘last’}，设定缺失值的显示位置"
   ]
  },
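  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A short sketch of these parameters on a toy frame. The name and value columns are invented for illustration; one value is missing so that na_position has a visible effect.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "# Toy data, invented for illustration only\n",
    "toy = pd.DataFrame({\n",
    "    'name': ['a', 'b', 'c', 'd'],\n",
    "    'value': [30.0, np.nan, 10.0, 20.0],\n",
    "})\n",
    "\n",
    "# Descending order; missing values go last by default\n",
    "desc = toy.sort_values(by='value', ascending=False)\n",
    "print(desc['name'].tolist())\n",
    "\n",
    "# Ascending order (the default), with missing values moved to the front\n",
    "na_first = toy.sort_values(by='value', na_position='first')\n",
    "print(na_first['name'].tolist())\n",
    "\n",
    "# inplace defaults to False, so toy keeps its original row order\n",
    "print(toy['name'].tolist())\n",
    "```\n",
    "\n",
    "For a report with two-level column labels, such as the one above, by takes the full tuple, for example (\"估值（亿人民币）\", \"sum\")."
   ]
  },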
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### agg的几种方法\n",
    "* df.groupby([\"国家\",\"行业\"]).agg({\"估值（亿人民币）\":[\"sum\",\"mean\",\"count\"]})\n",
    "* df.groupby([\"国家\",\"行业\"])[\"估值（亿人民币）\"].agg(sum=\"sum\",mean=\"mean\",count=\"count\")\n",
    "* df.groupby([\"国家\",\"行业\"])[\"估值（亿人民币）\"].agg（[\"sum\",\"mean\",\"count\"]）"
   ]
  },
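  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The three call styles return the same numbers but label the columns differently. The sketch below compares them on a toy frame; the data and the short 估值 column name are invented for illustration.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Toy data, invented for illustration only (not the Hurun dataset)\n",
    "toy = pd.DataFrame({\n",
    "    '国家': ['A', 'A', 'B'],\n",
    "    '行业': ['x', 'x', 'y'],\n",
    "    '估值': [10, 30, 20],\n",
    "})\n",
    "grouped = toy.groupby(['国家', '行业'])\n",
    "\n",
    "# 1. Dict of lists on the whole frame: two-level column labels\n",
    "by_dict = grouped.agg({'估值': ['sum', 'mean', 'count']})\n",
    "print(by_dict.columns.tolist())\n",
    "\n",
    "# 2. Named aggregation on one column: flat labels chosen by the caller\n",
    "by_name = grouped['估值'].agg(合计='sum', 平均='mean', 数量='count')\n",
    "print(by_name.columns.tolist())\n",
    "\n",
    "# 3. List of functions on one column: flat labels taken from the function names\n",
    "by_list = grouped['估值'].agg(['sum', 'mean', 'count'])\n",
    "print(by_list.columns.tolist())\n",
    "```\n",
    "\n",
    "Named aggregation (style 2) avoids a separate rename() step when only one column is being summarised."
   ]
  },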
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### agg 参数 的报表顺序\n",
    "\n",
    "原来顺序是：\n",
    "* (    '企业名称', '数量')\n",
    "* ('估值（亿人民币）', '总和')\n",
    "* ('估值（亿人民币）', '均值')\n",
    "* (    '成立年份', '最新')\n",
    "* (    '成立年份', '最早')\n",
    "\n",
    "请改顺序为：\n",
    "* (    '成立年份', '最早')\n",
    "* (    '成立年份', '最新')\n",
    "* (    '企业名称', '数量')\n",
    "* ('估值（亿人民币）', '均值')\n",
    "* ('估值（亿人民币）', '总和')\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 104,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>企业名称</th>\n",
       "      <th colspan=\"2\" halign=\"left\">估值（亿人民币）</th>\n",
       "      <th colspan=\"2\" halign=\"left\">成立年份</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>数量</th>\n",
       "      <th>总和</th>\n",
       "      <th>均值</th>\n",
       "      <th>最新</th>\n",
       "      <th>最早</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>国家</th>\n",
       "      <th>行业</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td rowspan=\"2\" valign=\"top\">中国</td>\n",
       "      <td>金融科技</td>\n",
       "      <td>22</td>\n",
       "      <td>17960</td>\n",
       "      <td>816.363636</td>\n",
       "      <td>2018</td>\n",
       "      <td>2002</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>媒体和娱乐</td>\n",
       "      <td>17</td>\n",
       "      <td>8230</td>\n",
       "      <td>484.117647</td>\n",
       "      <td>2015</td>\n",
       "      <td>2003</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td rowspan=\"3\" valign=\"top\">美国</td>\n",
       "      <td>云计算</td>\n",
       "      <td>32</td>\n",
       "      <td>6880</td>\n",
       "      <td>215.000000</td>\n",
       "      <td>2015</td>\n",
       "      <td>2000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>共享经济</td>\n",
       "      <td>6</td>\n",
       "      <td>5670</td>\n",
       "      <td>945.000000</td>\n",
       "      <td>2017</td>\n",
       "      <td>2008</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>金融科技</td>\n",
       "      <td>21</td>\n",
       "      <td>5020</td>\n",
       "      <td>239.047619</td>\n",
       "      <td>2017</td>\n",
       "      <td>2000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>日本</td>\n",
       "      <td>区块链</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2014</td>\n",
       "      <td>2014</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td rowspan=\"2\" valign=\"top\">法国</td>\n",
       "      <td>人工智能</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2016</td>\n",
       "      <td>2016</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>媒体和娱乐</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2006</td>\n",
       "      <td>2006</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>爱沙尼亚</td>\n",
       "      <td>共享经济</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2013</td>\n",
       "      <td>2013</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>法国</td>\n",
       "      <td>健康科技</td>\n",
       "      <td>1</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2013</td>\n",
       "      <td>2013</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>103 rows × 5 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "           企业名称 估值（亿人民币）              成立年份      \n",
       "             数量       总和          均值    最新    最早\n",
       "国家   行业                                         \n",
       "中国   金融科技    22    17960  816.363636  2018  2002\n",
       "     媒体和娱乐   17     8230  484.117647  2015  2003\n",
       "美国   云计算     32     6880  215.000000  2015  2000\n",
       "     共享经济     6     5670  945.000000  2017  2008\n",
       "     金融科技    21     5020  239.047619  2017  2000\n",
       "...         ...      ...         ...   ...   ...\n",
       "日本   区块链      1       70   70.000000  2014  2014\n",
       "法国   人工智能     1       70   70.000000  2016  2016\n",
       "     媒体和娱乐    1       70   70.000000  2006  2006\n",
       "爱沙尼亚 共享经济     1       70   70.000000  2013  2013\n",
       "法国   健康科技     1       70   70.000000  2013  2013\n",
       "\n",
       "[103 rows x 5 columns]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th colspan=\"2\" halign=\"left\">成立年份</th>\n",
       "      <th>企业名称</th>\n",
       "      <th colspan=\"2\" halign=\"left\">估值（亿人民币）</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>最早</th>\n",
       "      <th>最新</th>\n",
       "      <th>数量</th>\n",
       "      <th>均值</th>\n",
       "      <th>总和</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>国家</th>\n",
       "      <th>行业</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td rowspan=\"2\" valign=\"top\">中国</td>\n",
       "      <td>金融科技</td>\n",
       "      <td>2002</td>\n",
       "      <td>2018</td>\n",
       "      <td>22</td>\n",
       "      <td>816.363636</td>\n",
       "      <td>17960</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>媒体和娱乐</td>\n",
       "      <td>2003</td>\n",
       "      <td>2015</td>\n",
       "      <td>17</td>\n",
       "      <td>484.117647</td>\n",
       "      <td>8230</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td rowspan=\"3\" valign=\"top\">美国</td>\n",
       "      <td>云计算</td>\n",
       "      <td>2000</td>\n",
       "      <td>2015</td>\n",
       "      <td>32</td>\n",
       "      <td>215.000000</td>\n",
       "      <td>6880</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>共享经济</td>\n",
       "      <td>2008</td>\n",
       "      <td>2017</td>\n",
       "      <td>6</td>\n",
       "      <td>945.000000</td>\n",
       "      <td>5670</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>金融科技</td>\n",
       "      <td>2000</td>\n",
       "      <td>2017</td>\n",
       "      <td>21</td>\n",
       "      <td>239.047619</td>\n",
       "      <td>5020</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>日本</td>\n",
       "      <td>区块链</td>\n",
       "      <td>2014</td>\n",
       "      <td>2014</td>\n",
       "      <td>1</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>70</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td rowspan=\"2\" valign=\"top\">法国</td>\n",
       "      <td>人工智能</td>\n",
       "      <td>2016</td>\n",
       "      <td>2016</td>\n",
       "      <td>1</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>70</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>媒体和娱乐</td>\n",
       "      <td>2006</td>\n",
       "      <td>2006</td>\n",
       "      <td>1</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>70</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>爱沙尼亚</td>\n",
       "      <td>共享经济</td>\n",
       "      <td>2013</td>\n",
       "      <td>2013</td>\n",
       "      <td>1</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>70</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>法国</td>\n",
       "      <td>健康科技</td>\n",
       "      <td>2013</td>\n",
       "      <td>2013</td>\n",
       "      <td>1</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>70</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>103 rows × 5 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "            成立年份       企业名称    估值（亿人民币）       \n",
       "              最早    最新   数量          均值     总和\n",
       "国家   行业                                       \n",
       "中国   金融科技   2002  2018   22  816.363636  17960\n",
       "     媒体和娱乐  2003  2015   17  484.117647   8230\n",
       "美国   云计算    2000  2015   32  215.000000   6880\n",
       "     共享经济   2008  2017    6  945.000000   5670\n",
       "     金融科技   2000  2017   21  239.047619   5020\n",
       "...          ...   ...  ...         ...    ...\n",
       "日本   区块链    2014  2014    1   70.000000     70\n",
       "法国   人工智能   2016  2016    1   70.000000     70\n",
       "     媒体和娱乐  2006  2006    1   70.000000     70\n",
       "爱沙尼亚 共享经济   2013  2013    1   70.000000     70\n",
       "法国   健康科技   2013  2013    1   70.000000     70\n",
       "\n",
       "[103 rows x 5 columns]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# A4-Extra: the order of keys and methods in the agg dict sets the column order of the report\n",
    "先国再行 = df.groupby ( by = ['国家','行业'] ) \\\n",
    "             .agg ({ \"企业名称\" : \"count\", \\\n",
    "                     \"估值（亿人民币）\":[\"sum\",\"mean\"], \\\n",
    "                     \"成立年份\":[\"max\",\"min\"],               }) \\\n",
    "             .sort_values ( by = [(\"估值（亿人民币）\",\"sum\")], ascending = False) \\\n",
    "             .rename ( columns = {\"sum\":\"总和\", \"mean\":\"均值\", \"count\":\"数量\", \"max\":\"最新\", \"min\":\"最早\"} )\n",
    "display(先国再行)\n",
    "\n",
    "with pd.ExcelWriter(\"20春_pandas_week03_hurun_unicorn.xlsx\") as writer:\n",
    "    先国再行.to_excel(writer,sheet_name=\"先国再行\") \n",
    "    \n",
    "\n",
    "先国再行 = df.groupby ( by = ['国家','行业'] ) \\\n",
    "             .agg ({ \"成立年份\":[\"min\",\"max\"], \\\n",
    "                     \"企业名称\" : \"count\", \\\n",
    "                     \"估值（亿人民币）\":[\"mean\",\"sum\"], \n",
    "                                                       }) \\\n",
    "             .sort_values ( by = [(\"估值（亿人民币）\",\"sum\")], ascending = False) \\\n",
    "             .rename ( columns = {\"sum\":\"总和\", \"mean\":\"均值\", \"count\":\"数量\", \"max\":\"最新\", \"min\":\"最早\"} )\n",
    "display(先国再行)\n",
    "\n",
    "with pd.ExcelWriter(\"20春_pandas_week03_hurun_unicorn.xlsx\") as writer:\n",
    "    先国再行.to_excel(writer,sheet_name=\"先国再行\") \n",
    "# ( come on, give it a go )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 105,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th colspan=\"3\" halign=\"left\">估值（亿人民币）</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>sum</th>\n",
       "      <th>mean</th>\n",
       "      <th>count</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>国家</th>\n",
       "      <th>行业</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td rowspan=\"5\" valign=\"top\">中国</td>\n",
       "      <td>云计算</td>\n",
       "      <td>460</td>\n",
       "      <td>92.000000</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>人工智能</td>\n",
       "      <td>2090</td>\n",
       "      <td>139.333333</td>\n",
       "      <td>15</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>健康科技</td>\n",
       "      <td>2060</td>\n",
       "      <td>158.461538</td>\n",
       "      <td>13</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>共享经济</td>\n",
       "      <td>4740</td>\n",
       "      <td>592.500000</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>区块链</td>\n",
       "      <td>1250</td>\n",
       "      <td>312.500000</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td rowspan=\"4\" valign=\"top\">韩国</td>\n",
       "      <td>游戏</td>\n",
       "      <td>350</td>\n",
       "      <td>350.000000</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>物流</td>\n",
       "      <td>200</td>\n",
       "      <td>200.000000</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>电子商务</td>\n",
       "      <td>740</td>\n",
       "      <td>246.666667</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>金融科技</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>马耳他</td>\n",
       "      <td>区块链</td>\n",
       "      <td>150</td>\n",
       "      <td>150.000000</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>103 rows × 3 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "         估值（亿人民币）                  \n",
       "              sum        mean count\n",
       "国家  行业                             \n",
       "中国  云计算       460   92.000000     5\n",
       "    人工智能     2090  139.333333    15\n",
       "    健康科技     2060  158.461538    13\n",
       "    共享经济     4740  592.500000     8\n",
       "    区块链      1250  312.500000     4\n",
       "...           ...         ...   ...\n",
       "韩国  游戏        350  350.000000     1\n",
       "    物流        200  200.000000     1\n",
       "    电子商务      740  246.666667     3\n",
       "    金融科技       70   70.000000     1\n",
       "马耳他 区块链       150  150.000000     1\n",
       "\n",
       "[103 rows x 3 columns]"
      ]
     },
     "execution_count": 105,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.groupby([\"国家\",\"行业\"]).agg({\"估值（亿人民币）\":[\"sum\",\"mean\",\"count\"]})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
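    "* Pass a dict to agg: each key is a column name and each value lists the methods applied to that column. Here every method is computed on the valuation column."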
   ]
  },
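  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a minimal sketch of the dict form of agg, the block below uses a small made-up table rather than the unicorn data; the names `toy`, `country`, `valuation` and `founded` are assumptions for illustration only. Each key names a column, and each value names the method, or list of methods, applied to that column.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical toy data; the column names are illustrative, not the notebook's.\n",
    "toy = pd.DataFrame({\n",
    "    'country': ['A', 'A', 'B'],\n",
    "    'valuation': [100, 300, 50],\n",
    "    'founded': [2010, 2014, 2012],\n",
    "})\n",
    "\n",
    "# Keys pick the column, values pick the method(s) applied to that column.\n",
    "report = toy.groupby('country').agg({\n",
    "    'valuation': ['sum', 'mean'],\n",
    "    'founded': 'min',\n",
    "})\n",
    "print(report)\n",
    "```\n",
    "\n",
    "Because at least one value is a list, the result has two-level column labels such as `('valuation', 'sum')`, the same shape as the reports above."
   ]
  },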
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
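    "### Summary\n",
    "1. Use groupby to split the rows into groups by the values of the grouping columns; you can pass a single column name or a list of several.\n",
    "2. Use agg to apply aggregation functions to each group; the aggregation methods in the table below are available.\n",
    "3. Use sort_values() to order the result by the values in one or more columns; with axis=1 it can instead sort by the values in a given row.\n",
    "4. Use rename() to change column names and index labels.\n",
    "\n",
    "#### Tips\n",
    "* Picture the content and layout of the report before you start.\n",
    "  \n",
    "#### Pitfalls / style notes\n",
    "* [Q] Some lines of code end with a `\\` character. What does it mean?\n",
    "* [A] It tells Python that the statement is not finished and continues on the next line."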
   ]
  },
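  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The four steps in the summary can be chained into one expression. The sketch below runs them on a small made-up table (the names `toy`, `country`, `industry`, `valuation`, `total` and `average` are illustrative assumptions). It also wraps the chain in parentheses, which is an alternative to ending each line with a backslash.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical toy data; not the notebook's dataset.\n",
    "toy = pd.DataFrame({\n",
    "    'country': ['A', 'A', 'B', 'B'],\n",
    "    'industry': ['x', 'y', 'x', 'x'],\n",
    "    'valuation': [100, 300, 50, 70],\n",
    "})\n",
    "\n",
    "# Parentheses let the chain span several lines without trailing backslashes.\n",
    "report = (\n",
    "    toy.groupby(['country', 'industry'])                          # 1. split\n",
    "       .agg({'valuation': ['sum', 'mean']})                       # 2. aggregate\n",
    "       .sort_values(by=[('valuation', 'sum')], ascending=False)   # 3. sort\n",
    "       .rename(columns={'sum': 'total', 'mean': 'average'})       # 4. relabel\n",
    ")\n",
    "print(report)\n",
    "```\n",
    "\n",
    "The sort key is a tuple because the aggregated columns have two levels; rename then relabels the inner level only, since the outer label matches no key in the mapping."
   ]
  },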
  {
   "cell_type": "code",
   "execution_count": 106,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Method</th>\n",
       "      <th>Meaning</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>count</td>\n",
       "      <td>Number of non-NA values in the group</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>sum</td>\n",
       "      <td>Sum of non-NA values</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>mean</td>\n",
       "      <td>Mean of non-NA values</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>median</td>\n",
       "      <td>Arithmetic median of non-NA values</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>4</td>\n",
       "      <td>std, var</td>\n",
       "      <td>Standard deviation and variance of non-NA values</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5</td>\n",
       "      <td>min, max</td>\n",
       "      <td>Minimum and maximum of non-NA values</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>6</td>\n",
       "      <td>prod</td>\n",
       "      <td>Product of non-NA values</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>7</td>\n",
       "      <td>first, last</td>\n",
       "      <td>First and last non-NA values</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        Method                                           Meaning\n",
       "0        count              Number of non-NA values in the group\n",
       "1          sum                              Sum of non-NA values\n",
       "2         mean                             Mean of non-NA values\n",
       "3       median                Arithmetic median of non-NA values\n",
       "4     std, var  Standard deviation and variance of non-NA values\n",
       "5     min, max              Minimum and maximum of non-NA values\n",
       "6         prod                          Product of non-NA values\n",
       "7  first, last                      First and last non-NA values"
      ]
     },
     "execution_count": 106,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Aggregation methods that can be passed to agg\n",
    "框框 = pd.DataFrame ( {\n",
    "        \"Method\": [\"count\",\"sum\",\"mean\",\"median\",\"std, var\",\"min, max\",\"prod\",\"first, last\"],\n",
    "        \"Meaning\": [\"Number of non-NA values in the group\", \"Sum of non-NA values\", \"Mean of non-NA values\", \"Arithmetic median of non-NA values\",\"Standard deviation and variance of non-NA values\",\"Minimum and maximum of non-NA values\",\"Product of non-NA values\",\"First and last non-NA values\"],\n",
    "      } )\n",
    "框框"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"bg-split\"></div>\n",
    "\n",
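    "## Split, Split, Split\n",
    "\n",
    "> <mark>Split, split, split</mark> picks up where last week's **slice, slice, slice** (slicing) left off. Slicing is the data scientist's scalpel for probing the data; once a point of attack has been found, groupby systematically breaks the whole dataset into pieces. With the data <mark>carved into pieces</mark>, you can work on each piece separately and then combine the results. Deciding how to split takes more than knowing the parameters of df.groupby: it is the start of building data sense for the domain knowledge, the statistics, and the data types and standards of data management.\n",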
    "\n",
    "-----\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
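    "### Splitting an object into groups\n",
    "* The grouping categories become the index of the result.\n",
    "* Start from a question, for example how a particular industry in China is doing. The raw table cannot show that directly; grouping is the step that prepares the data for further calculation."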
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 107,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>排名</th>\n",
       "      <th>企业名称</th>\n",
       "      <th>Company Name</th>\n",
       "      <th>估值（亿人民币）</th>\n",
       "      <th>国家</th>\n",
       "      <th>城市</th>\n",
       "      <th>行业</th>\n",
       "      <th>掌门人/创始人</th>\n",
       "      <th>成立年份</th>\n",
       "      <th>部分投资机构</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>蚂蚁金服</td>\n",
       "      <td>Ant Financial</td>\n",
       "      <td>10000</td>\n",
       "      <td>中国</td>\n",
       "      <td>杭州</td>\n",
       "      <td>金融科技</td>\n",
       "      <td>井贤栋</td>\n",
       "      <td>2014</td>\n",
       "      <td>春华资本、中投海外、红杉资本</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>字节跳动</td>\n",
       "      <td>Bytedance</td>\n",
       "      <td>5000</td>\n",
       "      <td>中国</td>\n",
       "      <td>北京</td>\n",
       "      <td>媒体和娱乐</td>\n",
       "      <td>张一鸣</td>\n",
       "      <td>2012</td>\n",
       "      <td>红杉资本、海纳亚洲、纪源资本、启明创投</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>3</td>\n",
       "      <td>滴滴出行</td>\n",
       "      <td>Didi Chuxing</td>\n",
       "      <td>3600</td>\n",
       "      <td>中国</td>\n",
       "      <td>北京</td>\n",
       "      <td>共享经济</td>\n",
       "      <td>程维</td>\n",
       "      <td>2012</td>\n",
       "      <td>腾讯、阿里巴巴、红杉资本、经纬中国、纪源资本</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>4</td>\n",
       "      <td>Infor</td>\n",
       "      <td>Infor</td>\n",
       "      <td>3500</td>\n",
       "      <td>美国</td>\n",
       "      <td>纽约</td>\n",
       "      <td>云计算</td>\n",
       "      <td>Jim Schaper</td>\n",
       "      <td>2002</td>\n",
       "      <td>Golden Gate Capital, Koch Equity Development</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>4</td>\n",
       "      <td>5</td>\n",
       "      <td>JUUL Labs</td>\n",
       "      <td>JUUL Labs</td>\n",
       "      <td>3400</td>\n",
       "      <td>美国</td>\n",
       "      <td>旧金山</td>\n",
       "      <td>消费品</td>\n",
       "      <td>Adam Bowen, James Monsees, Kevin Burns, Tim Da...</td>\n",
       "      <td>2015</td>\n",
       "      <td>M13, Timothy Davis, Evolution VC Partners, Tig...</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   排名       企业名称   Company Name  估值（亿人民币）  国家   城市     行业  \\\n",
       "0   1       蚂蚁金服  Ant Financial     10000  中国   杭州   金融科技   \n",
       "1   2       字节跳动      Bytedance      5000  中国   北京  媒体和娱乐   \n",
       "2   3       滴滴出行   Didi Chuxing      3600  中国   北京   共享经济   \n",
       "3   4      Infor          Infor      3500  美国   纽约    云计算   \n",
       "4   5  JUUL Labs      JUUL Labs      3400  美国  旧金山    消费品   \n",
       "\n",
       "                                             掌门人/创始人  成立年份  \\\n",
       "0                                                井贤栋  2014   \n",
       "1                                                张一鸣  2012   \n",
       "2                                                 程维  2012   \n",
       "3                                        Jim Schaper  2002   \n",
       "4  Adam Bowen, James Monsees, Kevin Burns, Tim Da...  2015   \n",
       "\n",
       "                                              部分投资机构  \n",
       "0                                     春华资本、中投海外、红杉资本  \n",
       "1                                红杉资本、海纳亚洲、纪源资本、启明创投  \n",
       "2                             腾讯、阿里巴巴、红杉资本、经纬中国、纪源资本  \n",
       "3       Golden Gate Capital, Koch Equity Development  \n",
       "4  M13, Timothy Davis, Evolution VC Partners, Tig...  "
      ]
     },
     "execution_count": 107,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 108,
   "metadata": {},
   "outputs": [],
   "source": [
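    "# A string passed to groupby can refer to a column name or to an index level name\n",
    "按国家分 = df.groupby(\"国家\")\n",
    "# Nothing is displayed yet: groupby only returns a GroupBy object"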
   ]
  },
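  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A GroupBy object shows nothing by itself, but it can be inspected before any aggregation. The sketch below uses made-up data (the names `toy`, `country`, `valuation` and `by_country` are illustrative assumptions) to count the groups, pull one group out, and loop over all of them.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical toy data; not the notebook's dataset.\n",
    "toy = pd.DataFrame({\n",
    "    'country': ['A', 'A', 'B'],\n",
    "    'valuation': [100, 300, 50],\n",
    "})\n",
    "\n",
    "by_country = toy.groupby('country')\n",
    "\n",
    "# How many groups there are, and how many rows fall in each.\n",
    "n_groups = by_country.ngroups\n",
    "sizes = by_country.size()\n",
    "\n",
    "# Pull out a single group as an ordinary DataFrame.\n",
    "group_a = by_country.get_group('A')\n",
    "\n",
    "# Iterating yields (key, sub-DataFrame) pairs.\n",
    "totals = {key: part['valuation'].sum() for key, part in by_country}\n",
    "print(n_groups, sizes.to_dict(), totals)\n",
    "```\n",
    "\n",
    "Looking at `size()` first is a quick check that the split matches what you had in mind before computing anything on it."
   ]
  },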
  {
   "cell_type": "code",
   "execution_count": 109,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>排名</th>\n",
       "      <th>企业名称</th>\n",
       "      <th>Company Name</th>\n",
       "      <th>估值（亿人民币）</th>\n",
       "      <th>城市</th>\n",
       "      <th>行业</th>\n",
       "      <th>掌门人/创始人</th>\n",
       "      <th>成立年份</th>\n",
       "      <th>部分投资机构</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>国家</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>中国</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>以色列</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>卢森堡</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>印度</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>印度尼西亚</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>哥伦比亚</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>巴西</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>德国</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>新加坡</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>日本</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>法国</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>澳大利亚</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>爱尔兰</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>爱沙尼亚</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>瑞典</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>瑞士</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>美国</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>芬兰</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>英国</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>菲律宾</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>西班牙</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>阿根廷</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>韩国</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>马耳他</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          排名    企业名称 Company Name 估值（亿人民币）      城市      行业 掌门人/创始人   成立年份  \\\n",
       "国家                                                                          \n",
       "中国     int64  object       object    int64  object  object  object  int64   \n",
       "以色列    int64  object       object    int64  object  object  object  int64   \n",
       "卢森堡    int64  object       object    int64  object  object  object  int64   \n",
       "印度     int64  object       object    int64  object  object  object  int64   \n",
       "印度尼西亚  int64  object       object    int64  object  object  object  int64   \n",
       "哥伦比亚   int64  object       object    int64  object  object  object  int64   \n",
       "巴西     int64  object       object    int64  object  object  object  int64   \n",
       "德国     int64  object       object    int64  object  object  object  int64   \n",
       "新加坡    int64  object       object    int64  object  object  object  int64   \n",
       "日本     int64  object       object    int64  object  object  object  int64   \n",
       "法国     int64  object       object    int64  object  object  object  int64   \n",
       "澳大利亚   int64  object       object    int64  object  object  object  int64   \n",
       "爱尔兰    int64  object       object    int64  object  object  object  int64   \n",
       "爱沙尼亚   int64  object       object    int64  object  object  object  int64   \n",
       "瑞典     int64  object       object    int64  object  object  object  int64   \n",
       "瑞士     int64  object       object    int64  object  object  object  int64   \n",
       "美国     int64  object       object    int64  object  object  object  int64   \n",
       "芬兰     int64  object       object    int64  object  object  object  int64   \n",
       "英国     int64  object       object    int64  object  object  object  int64   \n",
       "菲律宾    int64  object       object    int64  object  object  object  int64   \n",
       "西班牙    int64  object       object    int64  object  object  object  int64   \n",
       "阿根廷    int64  object       object    int64  object  object  object  int64   \n",
       "韩国     int64  object       object    int64  object  object  object  int64   \n",
       "马耳他    int64  object       object    int64  object  object  object  int64   \n",
       "\n",
       "       部分投资机构  \n",
       "国家             \n",
       "中国     object  \n",
       "以色列    object  \n",
       "卢森堡    object  \n",
       "印度     object  \n",
       "印度尼西亚  object  \n",
       "哥伦比亚   object  \n",
       "巴西     object  \n",
       "德国     object  \n",
       "新加坡    object  \n",
       "日本     object  \n",
       "法国     object  \n",
       "澳大利亚   object  \n",
       "爱尔兰    object  \n",
       "爱沙尼亚   object  \n",
       "瑞典     object  \n",
       "瑞士     object  \n",
       "美国     object  \n",
       "芬兰     object  \n",
       "英国     object  \n",
       "菲律宾    object  \n",
       "西班牙    object  \n",
       "阿根廷    object  \n",
       "韩国     object  \n",
       "马耳他    object  "
      ]
     },
     "execution_count": 109,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "按国家分.dtypes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 110,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# 多个列作为索引\n",
    "按国家行业分 = df.groupby([\"国家\",\"行业\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 111,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>排名</th>\n",
       "      <th>企业名称</th>\n",
       "      <th>Company Name</th>\n",
       "      <th>估值（亿人民币）</th>\n",
       "      <th>城市</th>\n",
       "      <th>掌门人/创始人</th>\n",
       "      <th>成立年份</th>\n",
       "      <th>部分投资机构</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>国家</th>\n",
       "      <th>行业</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td rowspan=\"5\" valign=\"top\">中国</td>\n",
       "      <td>云计算</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>人工智能</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>健康科技</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>共享经济</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>区块链</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td rowspan=\"4\" valign=\"top\">韩国</td>\n",
       "      <td>游戏</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>物流</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>电子商务</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>金融科技</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>马耳他</td>\n",
       "      <td>区块链</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "      <td>object</td>\n",
       "      <td>int64</td>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>103 rows × 8 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "             排名    企业名称 Company Name 估值（亿人民币）      城市 掌门人/创始人   成立年份  部分投资机构\n",
       "国家  行业                                                                      \n",
       "中国  云计算   int64  object       object    int64  object  object  int64  object\n",
       "    人工智能  int64  object       object    int64  object  object  int64  object\n",
       "    健康科技  int64  object       object    int64  object  object  int64  object\n",
       "    共享经济  int64  object       object    int64  object  object  int64  object\n",
       "    区块链   int64  object       object    int64  object  object  int64  object\n",
       "...         ...     ...          ...      ...     ...     ...    ...     ...\n",
       "韩国  游戏    int64  object       object    int64  object  object  int64  object\n",
       "    物流    int64  object       object    int64  object  object  int64  object\n",
       "    电子商务  int64  object       object    int64  object  object  int64  object\n",
       "    金融科技  int64  object       object    int64  object  object  int64  object\n",
       "马耳他 区块链   int64  object       object    int64  object  object  int64  object\n",
       "\n",
       "[103 rows x 8 columns]"
      ]
     },
     "execution_count": 111,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "按国家行业分.dtypes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 112,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>排名</th>\n",
       "      <th>估值（亿人民币）</th>\n",
       "      <th>成立年份</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>国家</th>\n",
       "      <th>行业</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td rowspan=\"5\" valign=\"top\">中国</td>\n",
       "      <td>云计算</td>\n",
       "      <td>230.800000</td>\n",
       "      <td>92.000000</td>\n",
       "      <td>2012.400000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>人工智能</td>\n",
       "      <td>189.333333</td>\n",
       "      <td>139.333333</td>\n",
       "      <td>2013.466667</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>健康科技</td>\n",
       "      <td>206.538462</td>\n",
       "      <td>158.461538</td>\n",
       "      <td>2011.384615</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>共享经济</td>\n",
       "      <td>148.750000</td>\n",
       "      <td>592.500000</td>\n",
       "      <td>2014.375000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>区块链</td>\n",
       "      <td>116.500000</td>\n",
       "      <td>312.500000</td>\n",
       "      <td>2014.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td rowspan=\"4\" valign=\"top\">韩国</td>\n",
       "      <td>游戏</td>\n",
       "      <td>50.000000</td>\n",
       "      <td>350.000000</td>\n",
       "      <td>2007.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>物流</td>\n",
       "      <td>84.000000</td>\n",
       "      <td>200.000000</td>\n",
       "      <td>2011.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>电子商务</td>\n",
       "      <td>184.333333</td>\n",
       "      <td>246.666667</td>\n",
       "      <td>2008.333333</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>金融科技</td>\n",
       "      <td>264.000000</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>2011.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>马耳他</td>\n",
       "      <td>区块链</td>\n",
       "      <td>138.000000</td>\n",
       "      <td>150.000000</td>\n",
       "      <td>2017.000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>103 rows × 3 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                  排名    估值（亿人民币）         成立年份\n",
       "国家  行业                                       \n",
       "中国  云计算   230.800000   92.000000  2012.400000\n",
       "    人工智能  189.333333  139.333333  2013.466667\n",
       "    健康科技  206.538462  158.461538  2011.384615\n",
       "    共享经济  148.750000  592.500000  2014.375000\n",
       "    区块链   116.500000  312.500000  2014.000000\n",
       "...              ...         ...          ...\n",
       "韩国  游戏     50.000000  350.000000  2007.000000\n",
       "    物流     84.000000  200.000000  2011.000000\n",
       "    电子商务  184.333333  246.666667  2008.333333\n",
       "    金融科技  264.000000   70.000000  2011.000000\n",
       "马耳他 区块链   138.000000  150.000000  2017.000000\n",
       "\n",
       "[103 rows x 3 columns]"
      ]
     },
     "execution_count": 112,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "按国家行业分.mean()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 只有int值数量的部分是可以运算的，对象object只能被分类"
   ]
  },
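  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch of the note above, using a made-up toy frame (the names `toy`, `country`, `company`, `valuation` and `founded` are hypothetical, not columns of the Hurun table). Selecting the numeric columns before calling `mean()` makes the intent explicit, and newer pandas releases may raise an error instead of silently dropping object columns.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical toy data, not the Hurun unicorn table\n",
    "toy = pd.DataFrame({\n",
    "    \"country\": [\"A\", \"A\", \"B\"],\n",
    "    \"company\": [\"x\", \"y\", \"z\"],\n",
    "    \"valuation\": [100, 300, 50],\n",
    "    \"founded\": [2010, 2014, 2017],\n",
    "})\n",
    "\n",
    "# Pick the numeric columns explicitly, then average within each group\n",
    "numeric_cols = [\"valuation\", \"founded\"]\n",
    "means = toy.groupby(\"country\")[numeric_cols].mean()\n",
    "print(means)\n",
    "```\n",
    "\n",
    "The object column `company` never reaches `mean()`, so the result has exactly the two numeric columns."
   ]
  },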
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### GroupBy对象属性\n",
    "\n",
    "该groups属性是一个dict，其键是计算出的唯一组，而对应的值是属于每个组的轴标签。在上面的示例中，我们有："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 113,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'中国': Int64Index([  0,   1,   2,   6,  10,  11,  12,  13,  14,  19,\n",
       "             ...\n",
       "             481, 482, 483, 484, 485, 486, 487, 488, 490, 491],\n",
       "            dtype='int64', length=206),\n",
       " '以色列': Int64Index([178, 184, 190, 310, 364, 384, 415], dtype='int64'),\n",
       " '卢森堡': Int64Index([342], dtype='int64'),\n",
       " '印度': Int64Index([ 23,  42,  45,  52,  81, 113, 123, 146, 163, 192, 206, 287, 323,\n",
       "             347, 361, 410, 422, 425, 432, 437, 465],\n",
       "            dtype='int64'),\n",
       " '印度尼西亚': Int64Index([22, 39, 74, 294], dtype='int64'),\n",
       " '哥伦比亚': Int64Index([427], dtype='int64'),\n",
       " '巴西': Int64Index([66, 344, 358, 389], dtype='int64'),\n",
       " '德国': Int64Index([56, 108, 160, 169, 269, 339, 411], dtype='int64'),\n",
       " '新加坡': Int64Index([15, 50], dtype='int64'),\n",
       " '日本': Int64Index([202, 388], dtype='int64'),\n",
       " '法国': Int64Index([149, 317, 320, 396], dtype='int64'),\n",
       " '澳大利亚': Int64Index([88], dtype='int64'),\n",
       " '爱尔兰': Int64Index([182], dtype='int64'),\n",
       " '爱沙尼亚': Int64Index([290], dtype='int64'),\n",
       " '瑞典': Int64Index([62, 196], dtype='int64'),\n",
       " '瑞士': Int64Index([36, 167, 400], dtype='int64'),\n",
       " '美国': Int64Index([  3,   4,   5,   7,   8,   9,  16,  17,  18,  21,\n",
       "             ...\n",
       "             463, 466, 469, 470, 471, 472, 474, 489, 492, 493],\n",
       "            dtype='int64', length=203),\n",
       " '芬兰': Int64Index([349], dtype='int64'),\n",
       " '英国': Int64Index([55, 73, 82, 107, 110, 145, 157, 164, 172, 177, 197, 207, 418], dtype='int64'),\n",
       " '菲律宾': Int64Index([431], dtype='int64'),\n",
       " '西班牙': Int64Index([297], dtype='int64'),\n",
       " '阿根廷': Int64Index([281], dtype='int64'),\n",
       " '韩国': Int64Index([26, 49, 129, 458, 459, 479], dtype='int64'),\n",
       " '马耳他': Int64Index([147], dtype='int64')}"
      ]
     },
     "execution_count": 113,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "按国家分.groups"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 114,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{('中国', '云计算'): Int64Index([183, 251, 325, 464, 467], dtype='int64'),\n",
       " ('中国',\n",
       "  '人工智能'): Int64Index([46, 63, 91, 102, 153, 218, 241, 255, 267, 285, 337, 401, 402, 403,\n",
       "             414],\n",
       "            dtype='int64'),\n",
       " ('中国',\n",
       "  '健康科技'): Int64Index([27, 48, 77, 234, 245, 279, 305, 326, 345, 356, 380, 398, 454], dtype='int64'),\n",
       " ('中国',\n",
       "  '共享经济'): Int64Index([2, 47, 99, 127, 224, 254, 378, 438], dtype='int64'),\n",
       " ('中国', '区块链'): Int64Index([19, 87, 150, 226], dtype='int64'),\n",
       " ('中国',\n",
       "  '大数据'): Int64Index([232, 244, 250, 321, 338, 352, 372, 385, 451], dtype='int64'),\n",
       " ('中国',\n",
       "  '媒体和娱乐'): Int64Index([1, 13, 85, 93, 95, 101, 132, 154, 214, 220, 235, 242, 260, 275,\n",
       "             387, 392, 482],\n",
       "            dtype='int64'),\n",
       " ('中国', '房地产科技'): Int64Index([24, 80, 225, 240, 259, 332, 468], dtype='int64'),\n",
       " ('中国',\n",
       "  '教育科技'): Int64Index([128, 134, 136, 229, 265, 313, 353, 354, 365, 377, 490], dtype='int64'),\n",
       " ('中国', '新能源'): Int64Index([328, 423], dtype='int64'),\n",
       " ('中国',\n",
       "  '新能源汽车'): Int64Index([78, 79, 120, 133, 152, 158, 223, 227, 249, 292, 351, 381], dtype='int64'),\n",
       " ('中国', '新零售'): Int64Index([189, 266, 299, 407], dtype='int64'),\n",
       " ('中国', '机器人'): Int64Index([14, 76, 243], dtype='int64'),\n",
       " ('中国', '消费品'): Int64Index([69, 205, 257, 442], dtype='int64'),\n",
       " ('中国', '游戏'): Int64Index([230], dtype='int64'),\n",
       " ('中国',\n",
       "  '物流'): Int64Index([11, 20, 43, 64, 105, 181, 246, 248, 278, 334, 371, 379, 390, 481,\n",
       "             484, 488],\n",
       "            dtype='int64'),\n",
       " ('中国', '生命科学'): Int64Index([100, 258, 408, 441], dtype='int64'),\n",
       " ('中国',\n",
       "  '电子商务'): Int64Index([ 25,  34, 103, 106, 122, 131, 187, 215, 231, 236, 237, 238, 239,\n",
       "             247, 253, 256, 262, 286, 291, 303, 304, 333, 363, 368, 369, 370,\n",
       "             421, 428, 477, 480, 483, 485, 491],\n",
       "            dtype='int64'),\n",
       " ('中国', '网络安全'): Int64Index([116], dtype='int64'),\n",
       " ('中国',\n",
       "  '软件与服务'): Int64Index([130, 139, 141, 228, 233, 252, 261, 314, 336, 350, 359, 393, 476,\n",
       "             478, 487],\n",
       "            dtype='int64'),\n",
       " ('中国',\n",
       "  '金融科技'): Int64Index([  0,   6,  10,  12,  35,  37,  89,  96, 199, 263, 268, 273, 274,\n",
       "             318, 327, 383, 386, 429, 439, 473, 475, 486],\n",
       "            dtype='int64'),\n",
       " ('以色列', '云计算'): Int64Index([178, 190, 310, 384], dtype='int64'),\n",
       " ('以色列', '人工智能'): Int64Index([415], dtype='int64'),\n",
       " ('以色列', '生命科学'): Int64Index([364], dtype='int64'),\n",
       " ('以色列', '软件与服务'): Int64Index([184], dtype='int64'),\n",
       " ('卢森堡', '电子商务'): Int64Index([342], dtype='int64'),\n",
       " ('印度', '共享经济'): Int64Index([45, 52, 410], dtype='int64'),\n",
       " ('印度', '即时通讯'): Int64Index([347], dtype='int64'),\n",
       " ('印度', '大数据'): Int64Index([192], dtype='int64'),\n",
       " ('印度', '教育科技'): Int64Index([42], dtype='int64'),\n",
       " ('印度', '新能源'): Int64Index([206], dtype='int64'),\n",
       " ('印度', '新零售'): Int64Index([287], dtype='int64'),\n",
       " ('印度', '游戏'): Int64Index([323], dtype='int64'),\n",
       " ('印度', '物流'): Int64Index([81, 123, 163, 432], dtype='int64'),\n",
       " ('印度', '电子商务'): Int64Index([113, 425, 437, 465], dtype='int64'),\n",
       " ('印度', '软件与服务'): Int64Index([361], dtype='int64'),\n",
       " ('印度', '金融科技'): Int64Index([23, 146, 422], dtype='int64'),\n",
       " ('印度尼西亚', '共享经济'): Int64Index([22], dtype='int64'),\n",
       " ('印度尼西亚', '电子商务'): Int64Index([39, 74, 294], dtype='int64'),\n",
       " ('哥伦比亚', '物流'): Int64Index([427], dtype='int64'),\n",
       " ('巴西', '健康科技'): Int64Index([344], dtype='int64'),\n",
       " ('巴西', '物流'): Int64Index([358, 389], dtype='int64'),\n",
       " ('巴西', '金融科技'): Int64Index([66], dtype='int64'),\n",
       " ('德国', '生命科学'): Int64Index([160], dtype='int64'),\n",
       " ('德国', '电子商务'): Int64Index([56, 169, 269, 339, 411], dtype='int64'),\n",
       " ('德国', '金融科技'): Int64Index([108], dtype='int64'),\n",
       " ('新加坡', '共享经济'): Int64Index([15], dtype='int64'),\n",
       " ('新加坡', '电子商务'): Int64Index([50], dtype='int64'),\n",
       " ('日本', '人工智能'): Int64Index([202], dtype='int64'),\n",
       " ('日本', '区块链'): Int64Index([388], dtype='int64'),\n",
       " ('法国', '人工智能'): Int64Index([396], dtype='int64'),\n",
       " ('法国', '健康科技'): Int64Index([320], dtype='int64'),\n",
       " ('法国', '共享经济'): Int64Index([149], dtype='int64'),\n",
       " ('法国', '媒体和娱乐'): Int64Index([317], dtype='int64'),\n",
       " ('澳大利亚', '云计算'): Int64Index([88], dtype='int64'),\n",
       " ('爱尔兰', '云计算'): Int64Index([182], dtype='int64'),\n",
       " ('爱沙尼亚', '共享经济'): Int64Index([290], dtype='int64'),\n",
       " ('瑞典', '新能源'): Int64Index([196], dtype='int64'),\n",
       " ('瑞典', '金融科技'): Int64Index([62], dtype='int64'),\n",
       " ('瑞士', '区块链'): Int64Index([167], dtype='int64'),\n",
       " ('瑞士', '生命科学'): Int64Index([36], dtype='int64'),\n",
       " ('瑞士', '虚拟与增强现实'): Int64Index([400], dtype='int64'),\n",
       " ('美国', '3D印刷'): Int64Index([155, 165, 335], dtype='int64'),\n",
       " ('美国',\n",
       "  '云计算'): Int64Index([  3,  70,  92, 119, 142, 170, 174, 180, 208, 209, 211, 212, 219,\n",
       "             270, 272, 282, 308, 319, 324, 329, 341, 357, 366, 367, 397, 405,\n",
       "             416, 417, 446, 460, 470, 474],\n",
       "            dtype='int64'),\n",
       " ('美国',\n",
       "  '人工智能'): Int64Index([ 33,  41,  84, 135, 138, 143, 161, 162, 179, 200, 216, 296, 307,\n",
       "             315, 436, 444, 450, 456, 463, 489],\n",
       "            dtype='int64'),\n",
       " ('美国',\n",
       "  '健康科技'): Int64Index([98, 112, 121, 166, 175, 210, 221, 277, 295, 298, 412, 424], dtype='int64'),\n",
       " ('美国', '共享经济'): Int64Index([5, 8, 40, 148, 186, 462], dtype='int64'),\n",
       " ('美国', '区块链'): Int64Index([29, 53, 90, 289], dtype='int64'),\n",
       " ('美国', '即时通讯'): Int64Index([168, 194, 362, 448, 452], dtype='int64'),\n",
       " ('美国',\n",
       "  '大数据'): Int64Index([17, 71, 94, 301, 309, 346, 394, 461], dtype='int64'),\n",
       " ('美国', '媒体和娱乐'): Int64Index([16, 117, 151, 204, 213, 471], dtype='int64'),\n",
       " ('美国', '房地产科技'): Int64Index([104, 115, 419, 443, 472], dtype='int64'),\n",
       " ('美国', '教育科技'): Int64Index([271, 312, 466], dtype='int64'),\n",
       " ('美国', '新能源'): Int64Index([302, 399, 435, 440, 469], dtype='int64'),\n",
       " ('美国', '新能源汽车'): Int64Index([54, 59, 406], dtype='int64'),\n",
       " ('美国', '新零售'): Int64Index([176, 276, 284, 300, 375, 447], dtype='int64'),\n",
       " ('美国', '机器人'): Int64Index([109], dtype='int64'),\n",
       " ('美国', '消费品'): Int64Index([4, 195, 198, 217, 343, 420, 455], dtype='int64'),\n",
       " ('美国', '游戏'): Int64Index([51, 118, 126, 140, 322], dtype='int64'),\n",
       " ('美国',\n",
       "  '物流'): Int64Index([18, 31, 97, 171, 201, 222, 311, 374, 492], dtype='int64'),\n",
       " ('美国',\n",
       "  '生命科学'): Int64Index([21, 30, 61, 68, 124, 137, 193, 264, 340, 355], dtype='int64'),\n",
       " ('美国',\n",
       "  '电子商务'): Int64Index([28, 57, 60, 67, 75, 280, 330, 331, 348, 382, 395, 409, 430, 445,\n",
       "             453, 457, 493],\n",
       "            dtype='int64'),\n",
       " ('美国', '网络安全'): Int64Index([38, 306, 360, 376, 391, 413], dtype='int64'),\n",
       " ('美国', '航天'): Int64Index([7, 111, 433], dtype='int64'),\n",
       " ('美国', '虚拟与增强现实'): Int64Index([44, 65], dtype='int64'),\n",
       " ('美国', '软件与服务'): Int64Index([203, 293, 316, 449], dtype='int64'),\n",
       " ('美国',\n",
       "  '金融科技'): Int64Index([  9,  32,  58,  72,  83,  86, 114, 125, 144, 156, 159, 173, 185,\n",
       "             188, 191, 283, 288, 373, 404, 426, 434],\n",
       "            dtype='int64'),\n",
       " ('芬兰', '消费品'): Int64Index([349], dtype='int64'),\n",
       " ('英国', '人工智能'): Int64Index([145, 172], dtype='int64'),\n",
       " ('英国', '新能源'): Int64Index([418], dtype='int64'),\n",
       " ('英国', '游戏'): Int64Index([177], dtype='int64'),\n",
       " ('英国', '物流'): Int64Index([164], dtype='int64'),\n",
       " ('英国', '生命科学'): Int64Index([197], dtype='int64'),\n",
       " ('英国', '电子商务'): Int64Index([55], dtype='int64'),\n",
       " ('英国', '金融科技'): Int64Index([73, 82, 107, 110, 157, 207], dtype='int64'),\n",
       " ('菲律宾', '房地产科技'): Int64Index([431], dtype='int64'),\n",
       " ('西班牙', '共享经济'): Int64Index([297], dtype='int64'),\n",
       " ('阿根廷', '云计算'): Int64Index([281], dtype='int64'),\n",
       " ('韩国', '游戏'): Int64Index([49], dtype='int64'),\n",
       " ('韩国', '物流'): Int64Index([129], dtype='int64'),\n",
       " ('韩国', '电子商务'): Int64Index([26, 458, 479], dtype='int64'),\n",
       " ('韩国', '金融科技'): Int64Index([459], dtype='int64'),\n",
       " ('马耳他', '区块链'): Int64Index([147], dtype='int64')}"
      ]
     },
     "execution_count": 114,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "按国家行业分.groups"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* 多个索引内容分的是元组，元组是有顺序的，先国家再行业。"
   ]
  },
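  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the `groups` dict and its tuple keys concrete, here is a small sketch on made-up data (the frame `toy` and its columns `country`, `industry`, `valuation` are hypothetical). It prints each key with its row labels, then feeds one group's labels back into `.loc` to recover the rows.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical toy data, not the Hurun unicorn table\n",
    "toy = pd.DataFrame({\n",
    "    \"country\": [\"A\", \"A\", \"B\", \"A\"],\n",
    "    \"industry\": [\"ai\", \"cloud\", \"ai\", \"ai\"],\n",
    "    \"valuation\": [100, 300, 50, 70],\n",
    "})\n",
    "\n",
    "by_pair = toy.groupby([\"country\", \"industry\"])\n",
    "\n",
    "# groups maps each (country, industry) tuple to the row labels of that group\n",
    "for key, labels in by_pair.groups.items():\n",
    "    print(key, list(labels))\n",
    "\n",
    "# The stored labels can be passed to .loc to pull out one group's rows\n",
    "ai_in_a = toy.loc[by_pair.groups[(\"A\", \"ai\")]]\n",
    "print(ai_in_a)\n",
    "```\n",
    "\n",
    "The key order inside each tuple follows the order of the column list passed to `groupby`."
   ]
  },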
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "\n",
    "## 进进进\n",
    "\n",
    "* 分好组对自己想要的内容进行计算 \n",
    "\n",
    "-----\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 计算选项\n",
    "使用df.info()检查, 看看Dtype非object有哪些"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 115,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'pandas.core.frame.DataFrame'>\n",
      "RangeIndex: 494 entries, 0 to 493\n",
      "Data columns (total 10 columns):\n",
      "排名              494 non-null int64\n",
      "企业名称            494 non-null object\n",
      "Company Name    494 non-null object\n",
      "估值（亿人民币）        494 non-null int64\n",
      "国家              494 non-null object\n",
      "城市              494 non-null object\n",
      "行业              494 non-null object\n",
      "掌门人/创始人         494 non-null object\n",
      "成立年份            494 non-null int64\n",
      "部分投资机构          494 non-null object\n",
      "dtypes: int64(3), object(7)\n",
      "memory usage: 38.7+ KB\n"
     ]
    }
   ],
   "source": [
    "#df.head()\n",
    "df.info()"
   ]
  },
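  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Reading the `df.info()` listing by eye works for ten columns. As a hedged alternative, `select_dtypes` can split the columns programmatically. The frame `toy` and its column names below are made up for illustration.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical toy data, not the Hurun unicorn table\n",
    "toy = pd.DataFrame({\n",
    "    \"rank\": [1, 2, 3],\n",
    "    \"company\": [\"x\", \"y\", \"z\"],\n",
    "    \"valuation\": [100, 300, 50],\n",
    "})\n",
    "\n",
    "# Numeric columns are candidates for sum, mean and similar statistics\n",
    "numeric_cols = list(toy.select_dtypes(include=\"number\").columns)\n",
    "# Everything else is a candidate for grouping keys\n",
    "other_cols = list(toy.select_dtypes(exclude=\"number\").columns)\n",
    "print(numeric_cols, other_cols)\n",
    "```"
   ]
  },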
  {
   "cell_type": "code",
   "execution_count": 116,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "117970\n",
      "117970\n"
     ]
    }
   ],
   "source": [
    "\n",
    "print ( df[\"估值（亿人民币）\"].agg(\"sum\") )\n",
    "print ( df[\"估值（亿人民币）\"].sum() )\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 117,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>方法</th>\n",
       "      <th>意义</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>count</td>\n",
       "      <td>计算分组中非NA值的数量</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>sum</td>\n",
       "      <td>计算非NA值的和</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>mean</td>\n",
       "      <td>计算非NA值的平均值</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>median</td>\n",
       "      <td>计算非NA值的算术中位数</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>4</td>\n",
       "      <td>std、var</td>\n",
       "      <td>计算非NA值标准差和方差</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5</td>\n",
       "      <td>min、max</td>\n",
       "      <td>获得非NA值的最小和最大值</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>6</td>\n",
       "      <td>prod</td>\n",
       "      <td>计算非NA值的积</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>7</td>\n",
       "      <td>first、last</td>\n",
       "      <td>获得第一个和最后一个非NA值</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "           方法              意义\n",
       "0       count    计算分组中非NA值的数量\n",
       "1         sum        计算非NA值的和\n",
       "2        mean      计算非NA值的平均值\n",
       "3      median    计算非NA值的算术中位数\n",
       "4     std、var    计算非NA值标准差和方差\n",
       "5     min、max   获得非NA值的最小和最大值\n",
       "6        prod        计算非NA值的积\n",
       "7  first、last  获得第一个和最后一个非NA值"
      ]
     },
     "execution_count": 117,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# agg方法总结\n",
    "框框 = pd.DataFrame ( {\n",
    "        \"方法\": [\"count\",\"sum\",\"mean\",\"median\",\"std、var\",\"min、max\",\"prod\",\"first、last\"],\n",
    "        \"意义\": [\"计算分组中非NA值的数量\", \"计算非NA值的和\", \"计算非NA值的平均值\", \"计算非NA值的算术中位数\",\"计算非NA值标准差和方差\",\"获得非NA值的最小和最大值\",\"计算非NA值的积\",\"获得第一个和最后一个非NA值\"],\n",
    "      } )\n",
    "框框"
   ]
  },
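  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The summary table says every method works on non-NA values. A small sketch on made-up data (the frame `toy` and its columns are hypothetical) shows what that means in practice: the missing value is skipped by each statistic, while `size()` still counts the row.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical toy data with one missing valuation in group A\n",
    "toy = pd.DataFrame({\n",
    "    \"country\": [\"A\", \"A\", \"A\", \"B\"],\n",
    "    \"valuation\": [100.0, np.nan, 300.0, 50.0],\n",
    "})\n",
    "\n",
    "grouped = toy.groupby(\"country\")[\"valuation\"]\n",
    "\n",
    "# Method names are passed to agg as strings; NaN is skipped by each of them\n",
    "stats = grouped.agg([\"count\", \"sum\", \"mean\", \"median\", \"min\", \"max\", \"first\", \"last\"])\n",
    "print(stats)\n",
    "\n",
    "# size() counts rows including NaN, unlike count()\n",
    "print(grouped.size())\n",
    "```\n",
    "\n",
    "Comparing `count` with `size()` is a quick way to spot groups that contain missing values."
   ]
  },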
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"bg-comine\"></div>\n",
    "\n",
    "## 合合合\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "###  agg 的几种方式（多指标统计的方法）\n",
    "```python   \n",
    "\n",
    "df.groupby([\"国家\",\"行业\"])[\"估值（亿人民币）\"].agg(sum = \"sum\",mean = \"mean\",count = \"count\")\n",
    "df.groupby([\"国家\",\"行业\"])[\"估值（亿人民币）\"].agg([\"sum\",\"mean\",\"max\",\"min\"])\n",
    "df.groupby([\"国家\",\"行业\"]).agg({\"估值（亿人民币）\":[\"sum\",\"mean\",\"count\"]})\n",
    "```"
   ]
  },
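  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The three call styles listed above return the same numbers but label the result columns differently. The sketch below uses made-up data (the frame `toy` and the output names `total` and `average` are hypothetical) and prints only the resulting column labels.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical toy data, not the Hurun unicorn table\n",
    "toy = pd.DataFrame({\n",
    "    \"country\": [\"A\", \"A\", \"B\"],\n",
    "    \"valuation\": [100, 300, 50],\n",
    "})\n",
    "\n",
    "# 1) Named aggregation: keyword is the output column name, value is the method\n",
    "named = toy.groupby(\"country\")[\"valuation\"].agg(total=\"sum\", average=\"mean\")\n",
    "\n",
    "# 2) List of methods on one column: output columns are named after the methods\n",
    "listed = toy.groupby(\"country\")[\"valuation\"].agg([\"sum\", \"mean\"])\n",
    "\n",
    "# 3) Dict of column -> list of methods: output columns form a two-level index\n",
    "nested = toy.groupby(\"country\").agg({\"valuation\": [\"sum\", \"mean\"]})\n",
    "\n",
    "print(list(named.columns))\n",
    "print(list(listed.columns))\n",
    "print(list(nested.columns))\n",
    "```\n",
    "\n",
    "Named aggregation is convenient when the report needs custom column names; the dict form is convenient when different columns need different statistics."
   ]
  },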
  {
   "cell_type": "code",
   "execution_count": 118,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>sum</th>\n",
       "      <th>mean</th>\n",
       "      <th>count</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>国家</th>\n",
       "      <th>行业</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td rowspan=\"5\" valign=\"top\">中国</td>\n",
       "      <td>云计算</td>\n",
       "      <td>460</td>\n",
       "      <td>92.000000</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>人工智能</td>\n",
       "      <td>2090</td>\n",
       "      <td>139.333333</td>\n",
       "      <td>15</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>健康科技</td>\n",
       "      <td>2060</td>\n",
       "      <td>158.461538</td>\n",
       "      <td>13</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>共享经济</td>\n",
       "      <td>4740</td>\n",
       "      <td>592.500000</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>区块链</td>\n",
       "      <td>1250</td>\n",
       "      <td>312.500000</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td rowspan=\"4\" valign=\"top\">韩国</td>\n",
       "      <td>游戏</td>\n",
       "      <td>350</td>\n",
       "      <td>350.000000</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>物流</td>\n",
       "      <td>200</td>\n",
       "      <td>200.000000</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>电子商务</td>\n",
       "      <td>740</td>\n",
       "      <td>246.666667</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>金融科技</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>马耳他</td>\n",
       "      <td>区块链</td>\n",
       "      <td>150</td>\n",
       "      <td>150.000000</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>103 rows × 3 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "           sum        mean  count\n",
       "国家  行业                           \n",
       "中国  云计算    460   92.000000      5\n",
       "    人工智能  2090  139.333333     15\n",
       "    健康科技  2060  158.461538     13\n",
       "    共享经济  4740  592.500000      8\n",
       "    区块链   1250  312.500000      4\n",
       "...        ...         ...    ...\n",
       "韩国  游戏     350  350.000000      1\n",
       "    物流     200  200.000000      1\n",
       "    电子商务   740  246.666667      3\n",
       "    金融科技    70   70.000000      1\n",
       "马耳他 区块链    150  150.000000      1\n",
       "\n",
       "[103 rows x 3 columns]"
      ]
     },
     "execution_count": 118,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.groupby([\"国家\",\"行业\"])[\"估值（亿人民币）\"].agg(sum = \"sum\",mean = \"mean\",count = \"count\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 119,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>sum</th>\n",
       "      <th>mean</th>\n",
       "      <th>max</th>\n",
       "      <th>min</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>国家</th>\n",
       "      <th>行业</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td rowspan=\"5\" valign=\"top\">中国</td>\n",
       "      <td>云计算</td>\n",
       "      <td>460</td>\n",
       "      <td>92.000000</td>\n",
       "      <td>150</td>\n",
       "      <td>70</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>人工智能</td>\n",
       "      <td>2090</td>\n",
       "      <td>139.333333</td>\n",
       "      <td>400</td>\n",
       "      <td>70</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>健康科技</td>\n",
       "      <td>2060</td>\n",
       "      <td>158.461538</td>\n",
       "      <td>600</td>\n",
       "      <td>70</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>共享经济</td>\n",
       "      <td>4740</td>\n",
       "      <td>592.500000</td>\n",
       "      <td>3600</td>\n",
       "      <td>70</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>区块链</td>\n",
       "      <td>1250</td>\n",
       "      <td>312.500000</td>\n",
       "      <td>800</td>\n",
       "      <td>100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td rowspan=\"4\" valign=\"top\">韩国</td>\n",
       "      <td>游戏</td>\n",
       "      <td>350</td>\n",
       "      <td>350.000000</td>\n",
       "      <td>350</td>\n",
       "      <td>350</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>物流</td>\n",
       "      <td>200</td>\n",
       "      <td>200.000000</td>\n",
       "      <td>200</td>\n",
       "      <td>200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>电子商务</td>\n",
       "      <td>740</td>\n",
       "      <td>246.666667</td>\n",
       "      <td>600</td>\n",
       "      <td>70</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>金融科技</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>70</td>\n",
       "      <td>70</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>马耳他</td>\n",
       "      <td>区块链</td>\n",
       "      <td>150</td>\n",
       "      <td>150.000000</td>\n",
       "      <td>150</td>\n",
       "      <td>150</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>103 rows × 4 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "           sum        mean   max  min\n",
       "国家  行业                               \n",
       "中国  云计算    460   92.000000   150   70\n",
       "    人工智能  2090  139.333333   400   70\n",
       "    健康科技  2060  158.461538   600   70\n",
       "    共享经济  4740  592.500000  3600   70\n",
       "    区块链   1250  312.500000   800  100\n",
       "...        ...         ...   ...  ...\n",
       "韩国  游戏     350  350.000000   350  350\n",
       "    物流     200  200.000000   200  200\n",
       "    电子商务   740  246.666667   600   70\n",
       "    金融科技    70   70.000000    70   70\n",
       "马耳他 区块链    150  150.000000   150  150\n",
       "\n",
       "[103 rows x 4 columns]"
      ]
     },
     "execution_count": 119,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.groupby([\"国家\",\"行业\"])[\"估值（亿人民币）\"].agg([\"sum\",\"mean\",\"max\",\"min\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 120,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th colspan=\"3\" halign=\"left\">估值（亿人民币）</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>sum</th>\n",
       "      <th>mean</th>\n",
       "      <th>count</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>国家</th>\n",
       "      <th>行业</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td rowspan=\"5\" valign=\"top\">中国</td>\n",
       "      <td>云计算</td>\n",
       "      <td>460</td>\n",
       "      <td>92.000000</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>人工智能</td>\n",
       "      <td>2090</td>\n",
       "      <td>139.333333</td>\n",
       "      <td>15</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>健康科技</td>\n",
       "      <td>2060</td>\n",
       "      <td>158.461538</td>\n",
       "      <td>13</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>共享经济</td>\n",
       "      <td>4740</td>\n",
       "      <td>592.500000</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>区块链</td>\n",
       "      <td>1250</td>\n",
       "      <td>312.500000</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td rowspan=\"4\" valign=\"top\">韩国</td>\n",
       "      <td>游戏</td>\n",
       "      <td>350</td>\n",
       "      <td>350.000000</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>物流</td>\n",
       "      <td>200</td>\n",
       "      <td>200.000000</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>电子商务</td>\n",
       "      <td>740</td>\n",
       "      <td>246.666667</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>金融科技</td>\n",
       "      <td>70</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>马耳他</td>\n",
       "      <td>区块链</td>\n",
       "      <td>150</td>\n",
       "      <td>150.000000</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>103 rows × 3 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "         估值（亿人民币）                  \n",
       "              sum        mean count\n",
       "国家  行业                             \n",
       "中国  云计算       460   92.000000     5\n",
       "    人工智能     2090  139.333333    15\n",
       "    健康科技     2060  158.461538    13\n",
       "    共享经济     4740  592.500000     8\n",
       "    区块链      1250  312.500000     4\n",
       "...           ...         ...   ...\n",
       "韩国  游戏        350  350.000000     1\n",
       "    物流        200  200.000000     1\n",
       "    电子商务      740  246.666667     3\n",
       "    金融科技       70   70.000000     1\n",
       "马耳他 区块链       150  150.000000     1\n",
       "\n",
       "[103 rows x 3 columns]"
      ]
     },
     "execution_count": 120,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.groupby([\"国家\",\"行业\"]).agg({\"估值（亿人民币）\":[\"sum\",\"mean\",\"count\"]})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 数据感\n",
    "\n",
    "![02_split-apply-comine_detailed.png](02_split-apply-comine_detailed.png#full)\n",
    "\n",
    "1. 最初整个数据框\n",
    "2. 分细类\n",
    "3. 进行聚合计算\n",
    "\n",
    "### 什么是数据感\n",
    "* 就是对任何事情，任何看法首先第一要去找依据，而不是凭借自己的主观判断。那么依据从哪里来，就是来自于数据。\n",
    "* 也就是我们可以通过分类计算去得到我们想要的依据\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 121,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0      中国\n",
      "1      中国\n",
      "2      中国\n",
      "3      美国\n",
      "4      美国\n",
      "       ..\n",
      "489    美国\n",
      "490    中国\n",
      "491    中国\n",
      "492    美国\n",
      "493    美国\n",
      "Name: 国家, Length: 494, dtype: object\n",
      "0      中国\n",
      "1      中国\n",
      "2      中国\n",
      "3      美国\n",
      "4      美国\n",
      "       ..\n",
      "489    美国\n",
      "490    中国\n",
      "491    中国\n",
      "492    美国\n",
      "493    美国\n",
      "Name: 国家, Length: 494, dtype: category\n",
      "Categories (24, object): [中国, 以色列, 卢森堡, 印度, ..., 西班牙, 阿根廷, 韩国, 马耳他]\n"
     ]
    }
   ],
   "source": [
    "# E1 Categorical 类型的数据 转换数据类型\n",
    "print (df.国家)\n",
    "print (df.国家.astype('category'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " * dtype: object变成category，并且告诉你有24个类别"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## .boxplot()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 122,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "中国         AxesSubplot(0.1,0.15;0.363636x0.75)\n",
       "美国    AxesSubplot(0.536364,0.15;0.363636x0.75)\n",
       "dtype: object"
      ]
     },
     "execution_count": 122,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYcAAAEFCAYAAAAIZiutAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjAsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+17YcXAAAaVUlEQVR4nO3df3Rc9Xnn8fdjWZZkmRJSx0pNG2RqN5UifpgoDS4K1cRJUzWFAIckjCm7Wyk45gSl2xOyJogscZo54BJITtwGR3RMKMVKDS1iXXAPa9BQ3IV27S5NfCwWumu7WTe0cUxMZGPJlp/9Y67kka6ER6MZje7V53WOjud+7517n5Gf0XO/93t/mLsjIiKSa165AxARkdlHxUFEREJUHEREJETFQUREQlQcREQkRMUhwszsITP7g3LHIVJKZvZBM/tgzvRqM2srZ0xzwfxyByDTcir4CTGzfwSGgcFJ3lsF4O4fKE1oItNjZvMBA44A3zGz1cGsPwRuMTMD5rn7cLC8cr6IVByi7SSTFIdg3qeBHwFDnnNBi5lVAb8AbC15hCKF+x3gDmAomH4FqAZ+CPwJ2cLxIPDdYL5yvohUHCLEzP4U+Cjw46DpPcDHzawjmF4KpN39y2S/KACPAkvN7BeB48Bhsl+w64HTMxW7yFS5e6+Z/Rg4l2yufhioA3rIFoaT7r4z5y3K+SJScYiWIeBud98MYGZ/DOzNmf4K4eRPku1q/1eye17bgEqye1EiUWBTbFfOF4GKQ7Tks9czfpkNQCswshf1eeBV4CvFDEyk2MxsN3AMGDk89E6y4wYXBNPzzOxeoNXdf5rzVuV8Eag4RMs84Etm9plg+j3AVTnTS4EHct/g7nfAaK/iFSABfA2omImARQrl7s2502Z2K/ATd+85y/uU80Wg4hAtlZz9sFJlzvJmZgvcfSinbQfwS8DrMxOySGHM7GvAx8ju/TuwAnjLzD4bLDIfeMXdPzP2bcr5YlBxiJbPcqaLPZGv5rw2st3qbWY2cmrfR4J/q4JlJztmK1J27n4ncCeAmb2X7ED0/wb+0t0fn+AtyvkiUnGIEHcfP54wj5wLGcfNrwR+ONk53WZWj/7/ZZYzsxpgHbAm+Pk/wKNmdi3wTWB3zimryvki0i8q2mqBBZPMmw88lrMHNV4VukJeZjEzexBoAx4BPpoz6PwpM7sO6AYuMLOV7n4Q5XxRmR72E0/BHtcJ13+wRJSZLQGOuPtkF3piZj/v7j8JXivni0jFQUREQtTFEhGREBUHEREJmdUD0osXL/b6+vpyhxFJx44do7a2ttxhRM6ePXsOu/u7yrV95XxhlO+FmyznZ3VxqK+vZ/fu3eUOI5IymQytra3lDiNyzOxgObevnC+M8r1wk+W8DiuJiEiIioOIiISoOIiISIiKg4iIhKg4iIhIiIpDzPT09NDU1MTq1atpamqip+dtb30vIjKhvE5lNbM64HF3/5CZVQJ/RfapTGl33zKdthJ8pjmrp6eHrq4u0uk0w8PDVFRU0NGRfbx0Mpksc3QiEiVn7TmY2XnAw2TvAArQCexx9yuA683snGm2SZGkUinS6TSJRIL58+eTSCRIp9OkUqlyhyYiEZNPz2EY+DTwZDDdCtwevP5boHmabX25GzOztcBagLq6OjKZTL6fZc7r7+9neHiYTCbDwMAAmUyG4eFh+vv79XucxZTz0zeS71I8Zy0O7v4mgNnoA5RqgUPB6yNA3TTbxm+vm+x92mlubnZd9Zi/hoYGKioqaG1tHb1itK+vj4aGBl09Oosp56dPV0gXXyED0gNATfB6UbCO6bRJkXR1ddHR0UFfXx+nTp2ir6+Pjo4Ourq6yh2aiERMIfdW2gO0AI8DlwAvTbNNimRk0Lmzs5P+/n4aGhpIpVIajBaRKSukODwMPG1mHwIagb8ne6io0DYpomQySTKZVDdbRKYl78M67t4a/HsQ
+Cjwd8BH3H14Om1F/TQiIlIUBd2y293/FdhWrDYREZldNCAsIiIhKg4iIhKi4iAiIiEqDiIiEqLiICIiISoOIiISouIgIiIhKg4iIhKi4iAiIiEqDiIiEqLiICIiISoOIiISouIgIiIhKg4iIhKi4iAiIiEqDiIiEqLiICIiISoOIiISouIgIiIhKg4iIhKi4iAiIiEqDiIiEqLiICIiISoOIiISouIgIiIhKg4iIhKi4iAiIiEqDiIiEqLiICIiISoOIiISouIgIiIhUy4OZnaemT1tZrvN7DtBW9rMXjSzO3OWy6tNRERmn0J6DjcBj7p7M3COmf0XoMLdVwEXmtkKM7sun7aifQoRESmq+QW85ydAk5m9A/gl4CiwLZj3DNACrMyz7bXxKzeztcBagLq6OjKZTAEhysDAgH53EaGcnz7le/EVUhx2AR8HPg/0AwuAQ8G8I8BlQG2ebSHu3g10AzQ3N3tra2sBIUomk0G/u2hQzk+f8r34CjmsdBewzt2/CrwCrAFqgnmLgnUO5NkmIiKzUCF/oM8DLjKzCuCDwD1kDxEBXAIcAPbk2SYiIrNQIYeV7gYeAi4AXgS+AbxgZkuBNuBywPNsExGRWWjKPQd3/wd3f5+7L3L3j7r7m0Ar8BKQcPej+bYV60OIiEhxFdJzCHH3NzhzJtKU2kREZPbRoLCIiISoOIiISIiKg4iIhKg4iIhIiIqDiIiEqDiIiEiIioOIiISoOIiISIiKg4iIhKg4iIhIiIqDiIiEqDiIiEiIioOIiISoOIiISIiKg4iIhKg4iIhIiIqDiIiEqDiIiEiIioOIiISoOIiISIiKg4iIhKg4iIhIiIqDiIiEqDiIiEiIioOIiISoOIiISIiKg4iIhKg4iIhIiIqDiIiEqDiIiEhIwcXBzL5tZlcFr9Nm9qKZ3ZkzP682ERGZfQoqDmb2IeDd7r7dzK4DKtx9FXChma3It61on0JERIpq/lTfYGaVwIPA02b2CaAV2BbMfgZoAVbm2fbaBOtfC6wFqKurI5PJTDVEAQYGBvS7iwjl/PQp34tvysUB+A/APuCPgE7gc0A6mHcEuAyoBQ7l0Rbi7t1AN0Bzc7O3trYWEKJkMhn0u4sG5fz0Kd+Lr5DisBLodvfXzezPgV8HaoJ5i8geqhrIs01ERGahQv5A/zNwYfC6Gagne4gI4BLgALAnzzYREZmFCuk5pIEtZnYDUEl2zOG/mdlSoA24HHDghTzaRERkFppyz8Hdf+bun3T3K919lbsfJFsgXgIS7n7U3d/Mp61YH0JERIqrkJ5DiLu/wZkzkabUJiIis48GhUVEJETFQUREQlQcREQkRMVBRERCVBxERCRExUFEREJUHEREJETFQUREQlQcREQkRMVBRERCVBxipqenh6amJlavXk1TUxM9PT3lDklEIqgo91aS2aGnp4euri7S6TTDw8NUVFTQ0dEBQDKZLHN0IhIl6jnESCqVIp1Ok0gkmD9/PolEgnQ6TSqVKndoIhIxKg4x0t/fT0tLy5i2lpYW+vv7yxSRiESVikOMNDQ0sGvXrjFtu3btoqGhoUwRiUhUqTjESFdXFx0dHfT19XHq1Cn6+vro6Oigq6ur3KGJSMRoQDpGRgadOzs76e/vp6GhgVQqpcFoEZkyFYeYSSaTJJNJMpkMra2t5Q5HRCJKh5VERCRExUFEREJUHEREJETFQUREQlQcREQkRMVBRERCVBxERCRExUFEREJUHEREJETFQUREQlQcREQkRMVBRERCVBxERCSk4OJgZnVm9r+C12kze9HM7syZn1ebiIjMPtPpOXwdqDGz64AKd18FXGhmK/Jtm374IiJSCgU9z8HMPgwcA14HWoFtwaxngBZgZZ5tr02w7rXAWoC6ujoymUwhIc55AwMD+t1FhHJ++pTvxTfl4mBmC4AvA9cCvUAtcCiYfQS4bAptIe7eDXQDNDc3ux5YUxg97Cc6lPPTp3wvvkIOK90OfNvdfxpM
DwA1wetFwTrzbRMRkVmokD/QHwE+Z2YZ4FLgKrKHiAAuAQ4Ae/JsExGRWWjKh5Xc/cqR10GBuBp4wcyWAm3A5YDn2SYiIrPQtA7tuHuru79JdlD6JSDh7kfzbZvOtmViPT09NDU1sXr1apqamujp6Sl3SCISQQWdrTSeu7/BmTORptQmxdPT00NXVxfpdJrh4WEqKiro6OgAIJlMljk6EYkSDQrHSCqVIp1Ok0gkmD9/PolEgnQ6TSqVKndoIiWhnnLpFKXnILNDf38/LS0tY9paWlro7+8vU0QipaOecmmp5xAjDQ0N7Nq1a0zbrl27aGhoKFNEIqWTSqVYs2YNnZ2dfOxjH6Ozs5M1a9aop1wk6jnESFdXFx0dHaN7Un19fXR0dOjLIrG0b98+jh8/Huo5HDhwoNyhxYKKQ4yMdKU7Ozvp7++noaGBVCqlLrbE0oIFC7j11ltJJBKjV0jfeuut3HHHHeUOLRZUHGImmUySTCZ1OwGJvaGhITZt2sTKlStHe8qbNm1iaGio3KHFgoqDiERSY2Mj11xzzZie8o033khvb2+5Q4sFFQcRiaSurq4Jz1bSGFtxqDiISCRpjK20VBxEJLI0xlY6us5BRERCVBxERCRExUFEREJUHEREJETFQUREQlQcREQkRMUhZnR/e5lLlO+lo+scYkT3t5e5RPleWuo5xIieBCdzifK9tFQcYkRPgpO5RPleWioOMaInwclc0tDQwIYNG8aMOWzYsEH5XiQqDjEy8iS4vr4+Tp06NfokuK6urnKHJlJ0iUSCjRs30t7ezlNPPUV7ezsbN24kkUiUO7RY0IB0jOgulTKX9PX1sX79erZs2TKa7+vXr9fzHIrE3L3cMUyqubnZd+/eXe4wIkl3qSyMme1x9+ZybV85n7+KigpOnDhBZWXlaL6fPHmS6upqhoeHyx1eZEyW8zqsJCKRpDG20lJxEJFI0hhbaWnMQUQiSWNspaWeg4iIhKjnICKRpNtnlJZ6DiISSbp9RmlNuTiY2blmtsPMnjGzJ8xsgZmlzexFM7szZ7m82kRECqHbZ5RWIT2HG4H73f03gdeBG4AKd18FXGhmK8zsunzaivUhRGTu0amspTXlMQd3/3bO5LuA3wW+GUw/A7QAK4FtebS9Nn79ZrYWWAtQV1dHJpOZaogCDAwM6HcXEcr5wlx77bXceOONfPGLX2TZsmV84xvf4N5776Wjo0O/wyIoeEDazFYB5wEHgENB8xHgMqA2z7YQd+8GuiF7taiu8i2MrpCODuV8YVpbW2lsbCSVSo2eynrfffdpMLpIChqQNrN3ApuAdmAAqAlmLQrWmW+biEjBkskke/fu5dlnn2Xv3r0qDEVUyID0AuAx4EvufhDYQ/YQEcAlZHsS+baJiMgsVMhhpQ6yh4S6zKwLeAi4ycyWAm3A5YADL+TRJiIis9CUew7u/oC7n+furcHPw0Ar8BKQcPej7v5mPm3F+hAiMjf19PSMedhPT09PuUOKjaJcIe3ub3DmTKQptYmIFEJXSJeWBoVjRntSMlfoCunS0r2VYkR7UjKX6Arp0lLPIUa0JyVzia6QLi0VhxjRnpTMJXrYT2npsFKMNDQ08KlPfYodO3YwODhIVVUVbW1t2pOSWNLDfkpLPYcYOf/88+nt7aW9vZ3t27fT3t5Ob28v559/frlDEykJXSFdOuo5xMjzzz/PFVdcwZYtW3jggQeoqqriiiuu4Pnnny93aCISMSoOMTI4OMihQ4fYsWPH6NlK7e3tDA4Oljs0EYkYHVaKETOjra1tzNlKbW1tmFm5QxORiFHPIWa6u7tZvnw5jY2N3H///XR3d5c7JBGJIBWHGGlsbKSmpobbbrsNd8fMeP/7389bb71V7tBESqKzs5MHH3xw9Oy8m2++mU2bNpU7rFhQcYiRRCLB5s2b+frXv05jYyP79u1j/fr1rFu3rtyhiRRdZ2cnmzdvZuPGjWPyHVCBKAJz93LHMKnm5mbfvXt3
ucOIjKamJq655hp6e3tHz/semd67d2+5w4sEM9vj7s3l2r5yPn/V1dVcf/31vPzyy6P5fumll/L4449z4sSJcocXGZPlvIpDjFRUVHDixAkqKytHHxN68uRJqqurGR4eLnd4kaDiEB1mRn19PVu2bBlzdt6BAweYzX/XZpvJcl6HlWJEV0jLXGJmLF++fMwV0suXL+fgwYPlDi0WdCprjOgKaZlL3J2dO3dy5ZVX8uSTT3LllVeyc+dO9RqKRIeVYkTHYKdPh5Wio7q6mubmZnbv3j3aUx6ZVr7nT4eV5oDBwUG6u7tZuHDh6JjD8ePHefTRR8sdmkjRDQ0NTXhHgKGhoXKHFgsqDjFSVVXFypUree2110avc1ixYgVVVVXlDk2k6BobG1mxYgVtbW1jxthqa2vLHVosaMwhRpYsWcKrr77KqlWreOyxx1i1ahWvvvoqS5YsKXdoIkWXSCTo7e0dvXfY4OAgvb29JBKJMkcWDxpziJF58+axcOFCjh07NtpWW1vL8ePHOX36dBkjiw6NOURHdXX1hDeVrKqq0pjDFEyW8+o5xIi7c+zYMW655Ra2b9/OLbfcwrFjx3T2hsTSSGG4+uqreeKJJ7j66qvHtMv0qOcQI2ZGVVXVmC/HyPRs/n+eTdRziA4zY9myZSxcuHD07Lzjx4+zf/9+5fsUqOcwRwwODlJfX88jjzxCfX299qIk1vbv3097eztPPfUU7e3t7N+/v9whxYbOVoqZqqoqDh48yE033TRhT0Ikbu666y4GBgZYtGhRuUOJFfUcYmZwcJB169axfft21q1bp8IgsTcwMDDmXykO9RxiZvHixWzevJkHHngAM2Px4sUcPny43GGJFEW+TzXMXU7jD4VRzyHizGz0B+Dw4cOjXwZ3Hy0M45cTiSJ3H/3ZunUry5Yt47nnnuM9t/Xy3HPPsWzZMrZu3TpmOSmMeg4RNz75L774Yn7wgx+MTl900UV8//vfn+mwREoumUwC2Yf+/Mu+fjp3NJBKpUbbZXp0KmtEXLLhGY6+dbIk6z63ppJ/uus3S7LuqNGprLNDKfMdlPO5dOO9iDv61kkO3PPxvJcfufFePupvf6rAqERK43T9FzinlOsH4AdnWWpum/HiYGZpoBF4yt2/NtPbj6pzGm7noodvn9qbHs533QD5Fx6RUvtZ/z0l2xkC7RDlY0aLg5ldB1S4+yoz22JmK9z9tZmMIap+1n/PhO0HN/7OlNd1wfq/HjN9bk1lQTGJlNJEf8CLke+gnM/HjI45mNm3gL9x96fN7Aagxt0fGrfMWmAtQF1d3fu/973vzVh8caKLggqTSCRmfMxBOT99yvfCTZbzM31YqRY4FLw+Alw2fgF37wa6ITs4N5Wuopwx1W62lI9yfvqU78U309c5DAA1wetFZdi+iIjkYab/OO8BWoLXlwAHZnj7IiKSh5k+rNQLvGBmS4E24PIZ3r6IiORhRnsO7v4m0Aq8BCTc/ehMbl9ERPIz49c5uPsbwLaZ3q6IiORPA8IiIhIyq++tZGY/Bg6WO46IWgzoXt1Td4G7v6tcG1fOF0z5XrgJc35WFwcpnJntLucN5ERmkvK9+HRYSUREQlQcREQkRMUhvrrLHYDIDFK+F5nGHEREJEQ9BxERCVFxEBGREBWHaTKzs/4Ozaw653WlmU3rSSP5bHPc8jVmVjGdbb7Nun/OzC4sxbpl9lG+z518V3GYvufM7L0AZvbbZvbNCZbpNbPfMLN64PeALWZWb2a/bGZvewsTM7vAzO4b13yLmX1+Cl+ALwfbnWj9W83sf5jZznE//2pmlwTLnG9m28zsr8zsaTP7WzP7vpn9G/B/gW/nrO9uM1uRZ1wSPcr3OZLvM35vpTgxs4vJ/g7/OWg6DpwI5hlgwDJgEKgCPgl8IHh9ffDePwF+Nsn6FwL3AzePm3UTcDfwspkNAwuA9wLvc/dXJljVKbLP0pjISaB9/PvM7LvAUDD5I2BLsI7jwCeA14Hv
uPvpceu7h+wfg98LbrQoMaF8n1v5ruIwPXcDzwJPmNk7gJ8D3mlmlwOVZPdgPgv0AzuBLwG/CJwGzgX+0N0n/KIEPgfc5+5HRhrM7Cpgkbs/CTwZtN0LdE/yRYHsg5XOmWSek03u4+PaG8l+UQm+EH+TE8OvAScm+KLg7kfN7KvAHwAb3uazSfQo38evLMb5ruJQIDO7Efhl4H+6+9VBWyvwW+5+ezD9SbIPNdrv7qfNrJbsXhDAx4HzzrKZi9393pxt1gIpzuzhYGYfBH7F3b/4NutpJPtFfnCS+e3u/oqZXQO8x92/NcHnfRZYSHavcClw2sz+E1AN/NTdf2tkWXf/JzO77SyfTSJE+T738l3FoXBvAF8APhB0t78FvIMze1I9wAvAfyb7DAvIPiJ1efB6SR7bODVu+gayF/v8LowO1N0HJCdbgZmdG8TlZrbE3f99gsUeNLNjZPfulpjZb5Mdj/oXd/9MsMwJ4D+6+/8zs3Vk96S+GxxX/uM8YpdoU77PsXxXcSiQuz9tZiOPPF0EvJSzB9UCtLn7vuA46ohfAEaS793Afz/LZk6Z2bk5D0X6M2CY4MsCtAM73P2Hb7OOzwOPAIfIHh9tn2CZm4M9qXlAX+5eUY68r5Y0sxqyx58lJpTvk4trvutspeI4OUHbRHsSh8k+KrUX+Mc81vvnwO0jE+5+ctxxz7Vk9+AAMLOLgq74yPSvk30c62Z37wV+3sx+f6INBXuDFcBLZnZ90Ja78zBZrlSSPaac6wvAX5zls0l0Kd/HimW+qzhMz8jvrwJYY2YZM8sAm3LmGTAvOA3vKLAr+HkV4O1Oz3P358ke67x13Cwzs3cDPxo3wLcB+LVggRvInnJ3vbsPB/NvAj5tZo+YWV3Qdj7ZL9yXyR4G+Brw+2aWBDYFXW6AH3Lm2K8FMVwK/DXwdE5gHcA73X3HZJ9LIkv5PofyXYeVpqeG7Gl1lcDWcd3sq4Jlqsg+iORp4MfAV3Le/wGy/wffm2wD7t4VJOX47f47MGRmfWS7wDXAEeDvzOxXgWuA1e7+k5x1vWlmHwbuCGL6N2Af8JC7vzyynJl9guyXpxZ4LnjvZ3O2X5lt8pfN7H3unrvXuMfd05N9Hok05fscynfdeK8ILHsFaKW7jz89TiR2lO9zg4qDiIiEaMxBRERCVBxERCRExUFEREJUHEREJOT/A56R9MIENpj6AAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 2 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "%matplotlib inline\n",
    "import matplotlib as mpl  \n",
    "mpl.rcParams['font.sans-serif']=['SimHei'] #用来正常显示中文标签  \n",
    "mpl.rcParams['axes.unicode_minus']=False #用来正常显示负号 \n",
    "\n",
    "df[df.国家.isin([\"中国\",\"美国\"])][['国家',\"估值（亿人民币）\"]].groupby ( by = '国家' ).boxplot()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* 数据可视化就是要分类看有没有差别"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 123,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "中国         AxesSubplot(0.1,0.15;0.363636x0.75)\n",
       "美国    AxesSubplot(0.536364,0.15;0.363636x0.75)\n",
       "dtype: object"
      ]
     },
     "execution_count": 123,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAY0AAAEGCAYAAACZ0MnKAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjAsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+17YcXAAAbcUlEQVR4nO3df5BV5Z3n8feHXw2BhIEJ6UQ3SpFiE/AHFeiMUktqLirNSJzZrE5F8UeldjAIjuzsUlMF2EyMUYg4NdRYJDSBoYxmla04s2aJiDZK3wokmgmYVRnRqc0uZDRSYwKDwR+t4nf/uAe5XO/tfvpym9u3+/OqukWf5z7nOc85PPDp81sRgZmZWYoh9e6AmZk1DoeGmZklc2iYmVkyh4aZmSVzaJiZWTKHxgAj6V5J/63e/TDrS5IuknRR0fSlki6vZ58Gi2H17oDV3HvZ50MkPQMcB7oqzNsEEBFf6JuumZ0eScMAAYeB70q6NPvqDmCxJAFDIuJ4Vt9jvsYcGgPPu1QIjey7q4FXgXei6CYdSU3Ap4AH+7yHZtW7ArgVeCebfhEYCfwL8B0KgbIJ+F72vcd8jTk0GpykvwPmAK9lRecAX5K0IJs+C9gcEX9F4R8QwAPAWZL+HfAm8BsK//D+FHj/TPXdrLci4oeSXgPGUhirlwDNwBYKgfFuRDxRNIvHfI05NBrfO8C3ImIDgKRvA/uKpr/Bh/9RzKewy/51Cr+p/QAYTuG3LrNGoF6We8zXiEOj8aX8llRa53YgB5z4reu/AP8MfKOWHTOrNUl7gDeAE4eZxlM4L3FuNj1E0l8DuYj4t6JZPeZrxKHR+IYAKyTdmE2fA/xx0fRZQHvxDBFxK3ywF/IiMBu4Exh6JjpsVq2IaCmelnQL8NuI2NLDfB7zNeLQaHzD6fnw1PCi+pI0IiLeKSrbDnwaOHRmumxWHUl3AnMp7C0EMBl4S9JNWZVhwIsRceOps3nM14pDo/HdxMld9XK+WfSzKOye/0DSiUsQL8v+bMrqVjombFZ3EbESWAkg6bMUToC/BPxDRPx9mVk85mvModHgIqL0fMUQim7aLPl+OPAvla5JlzQRjwnr5ySNAhYB12afXwIPSPpPwN8Ce4ourfWYrzFvrIFnNDCiwnfDgIeKfuMq1YSfEmD9mKRNwOXA94E5RSe7vyLpSmAjcK6kz0fEQTzma05+CdPgkf2G9nb4L90alKRPAIcjotINrEj6/Yj4bfazx3yNOTTMzCyZd8vMzCyZQ8PMzJI5NMzMLFnDXT318Y9/PCZOnFjvbjSkN954g9GjR9e7Gw1n7969v4mICfVavsd8dTzeq9fdmG+40Jg4cSJ79uypdzcaUj6fJ5fL1bsbDUfSwXou32O+Oh7v1etuzPvwlJmZJXNomJlZMoeGmZklc2iYmVkyh4aZmSVzaJiZWTKHhpmZJXNomJlZsoa7uc96JvX+RWR+2rE1Ko/3M8t7GgNQRJT9nLvskYrfmTUqj/czy6FhZmbJHBpmZpbMoWFmZskcGmZmlsyhYWZmyRwaZmaWzKFhZmbJHBpmZpbMoWFmZskcGmZmlsyhYWZmyXoMDUljJW2X1CHpYUkjJG2W9JSklUX1miXtKpq+XVI++7woaUWF9s+W9HJR3Qm1WTUzM6u1lD2N64C1EdEKHAKuAYZGxExgkqTJksYB9wGjT8wUEbdFRC4icsA+4P4K7V8ErDpRNyJeO431MTOzPtRjaETE+ojYkU1OAK4HfpBNdwCzgOPA1cDrpfNL+gLwckS8UmERFwM3SnpG0upe9t/MzM6g5PdpSJoJjAMOACcC4DAwPSJez+qUm/UvgNu6aXo7cAfwJvCEpAsj4rmSZS8EFgI0NzeTz+dTu20lvO0ag8d8bXi71V5SaEgaD6wDrgKWAqOyr8bQzd6KpN8DPhERv+ym+Z9GRFdW/xfAZOCU0IiIjcBGgJaWlsjl
cindtlKPbcPbrjF4zNeAx3ufSDkRPgJ4CFgREQeBvRQOSQFMo7DnUcl/BB7tYRGPS/qUpI8ArRTOf5iZWT+UciJ8ATAdaJOUBwTcIGkt8BVgWzfzzgV+fGJC0iWSbimpczvQCTwNbIiIl9K7b2ZmZ1KPh6cioh1oLy6TtBWYA9wdEUeL6uZK5r22ZHonsLOkrBP4XG87bmZmZ17yifBiEXGEk1dQmZnZIOE7ws3MLJlDw8zMkjk0zMwsmUPDzMySOTTMzCyZQ8PMzJI5NMzMLJlDw8zMkjk0zMwsmUPDzMySOTTMzCyZQ8PMzJI5NMzMLJlDw8zMkjk0zMwsmUPDzMySpbwjfKyk7ZI6JD0saYSkzZKekrSyqF6zpF1F02dLellSPvtM6GYZH2rPzMz6n5Q9jeuAtRHRChwCrgGGRsRMYJKkyZLGAfcBo4vmuwhYFRG57PNaucYlXVna3umskJmZ9Z0eQyMi1kfEjmxyAnA9J1/12gHMAo4DVwOvF816MXCjpGckre5mEbky7ZmZWT+U/I5wSTOBccAB4JWs+DAwPSJez+oUz7IduAN4E3hC0oUR8VyZpkeXtldm2QuBhQDNzc3k8/nUblsJb7vG4DFfG95utZcUGpLGA+uAq4ClwKjsqzFU3lv5aUR0ZfP/ApgMlAuNYz21FxEbgY0ALS0tkcvlUrptpR7bhrddY/CYrwGP9z6RciJ8BPAQsCIiDgJ7OXkIaRqFPY9yHpf0KUkfAVqBfRXqpbZnZmZ1lrKnsYDCIaM2SW3AvcANks4CLqdw7qKc24FO4B1gQ0S8JGkqcG1EFF8l9UNgV0J7ZmZWZz2GRkS0A+3FZZK2AnOAuyPiaFHdXNHPncDnStp6AVhZUva6pFy59szMrH9JPhFeLCKOcPKKp9NW6/bMzKxv+I5wMzNL5tAwM7NkDg0zM0vm0DAzs2QODTMzS1bV1VPWP0y7vYOjb73bq3kmLt+WXHfsqOE8e1trb7tlZgOYQ6OBHX3rXQ7c9aXk+vl8vlePVehNwJjZ4ODDU2ZmlsyhYWZmyRwaZmaWzKFhZmbJHBpmZpbMoWFmZskcGmZmlsyhYWZmyXxzXwP76JTlXHDf8t7NdF9v2gdIv3nQrK/19ikIfgJC7Tk0Gtjv9t/lO8JtUOnNUxA83vtGj4enJI2VtF1Sh6SHJY2QtFnSU5JWFtVrlrSraPocSXlJOyVtlKQK7Z8t6eWsbl7ShNqsmpmZ1VrKOY3rgLUR0QocAq4BhkbETGCSpMmSxlE48DG6aL6bgMURcQnwaeCCCu1fBKyKiFz2ea3alTEzs77VY2hExPqI2JFNTgCu5+T7vDuAWcBx4Grg9aL52iJifzb5+8BvKiziYuBGSc9IWt37VTAzszMl+ZyGpJnAOOAA8EpWfBiYHhGvZ3XKzXc18E8R8esKTW8H7gDeBJ6QdGFEPFfSxkJgIUBzczP5fD612wNeb7bFsWPHer3tvK3rw2O+stRt4fHeRyKixw8wHtgDnAvcA1yclV8J3FpUL18y3yTg58DYbtpuKvp5LXBVd32ZMWNGWMG5yx7pVf3Ozs4+bX+gAvZEwr+Tvvp4zJ/UmzHp8V697sZ8yonwEcBDwIqIOAjspXBICmAahT2PcvONA7YAfxYRR7tZxOOSPiXpI0ArsK+nPpmZWX2kHJ5aAEwH2iS1AfcCN0g6C7icwjmJcpYD5wDrssNWtwFDgakR8e2iercDncA7wIaIeKmaFTEzs77XY2hERDvQXlwmaSswB7i7eC8iInJFPy8DlpVpcmdJ+53A53rVazMzq4uqbu6LiCOcvILKzMwGCT97yszMkjk0zMwsmUPDzMySOTTMzCyZQ8PMzJI5NMzMLJlDw8zMkjk0zMwsmUPDzMySOTTMzCyZ3xFuZg3jo1OWc8F9y9NnuK83bQOkvX98MHNomFnD+N3+uzhwV9p/7Pl8nlwul9z2xOXbquzV4OLDU2ZmlsyhYWZmyRwaZmaWzKFhZmbJ
HBpmZpasx9CQNFbSdkkdkh6WNELSZklPSVpZVK9Z0q6i6eGSfiTpJ5L+rJv2k+qZmVn9pexpXAesjYhW4BBwDTA0ImYCkyRNljSOwhXRo4vmWwLsjYj/APyppI9WaD+1npmZ1VmPoRER6yNiRzY5Abiek+8H7wBmAceBq4HXi2bNFdX7MdBSYRGp9czMrM6Sb+6TNBMYBxwAXsmKDwPTI+L1rE7xLKNL6jVXaLrHepIWAgsBmpubyefzqd0e8Hp9Q9Jj6fVHD8fbuk485itL3RbHjh3r9Xbzdu5ZUmhIGg+sA64ClgKjsq/GUHlv5VhW72hW71i19SJiI7ARoKWlJXpzl+dAdiDXu/oTl29LvpvW6stjvoLHtiXf5d3bO8J70/ZglnIifATwELAiIg4CeykckgKYRmHPo5xa1zMzszpL2dNYAEwH2iS1AfcCN0g6C7gcuLjCfPcBj0r6IjAV+JmkS4CpEfHt7upVtypmZtbXUk6Et0fEuIjIZZ/7KJy8fhqYHRFHi+rmin4+CMwBfgJcFhHHI2JnSWCUrVeD9TIzsz5Q1VNuI+IIJ6946q7er2tZz8zM6st3hJuZWTKHhpmZJXNomJlZMoeGmZklc2iYmVkyh4aZmSVzaJiZWTKHhpmZJXNomJlZMoeGmZklc2iYmVkyh4aZmSVzaJiZWTKHhpmZJXNomJlZsqrep2FmVi8Tl29Lr/xYet2xo4ZX0ZvBx6FhZg3jwF1fSq47cfm2XtW3ND0enpI0VtJ2SR2SHpY0QtJmSU9JWllU75QySYsl5bPP/5b03QrtD5P0q6K6F9Ru9czMrJZSzmlcB6yNiFbgEHANMDQiZgKTJE2WdGVpWfZu8Vz23vBdwKYK7V8IbCl6B/nzp71WZmbWJ3oMjYhYHxE7sskJwPWcfJ93BzALyJUpA0DS2UBzROypsIiLgSsk/WO2t+JDZmZm/VTyf9CSZgLjgAPAK1nxYWA6MLpM2Ql/DrR30/TPgcsi4lVJ9wPzgK0ly14ILARobm4mn8+ndttKeNs1Bo/52vB2q72k0JA0HlgHXAUsBUZlX42hsLdyrEwZkoYAs4G2bpp/LiK6sp/3AJNLK0TERmAjQEtLS+RyuZRuW6nHtuFt1xg85mvA471PpJwIHwE8BKyIiIPAXk4efppGYc+jXBnAF4GfRUR0s4jvS5omaSjwZeDZXq6DmZmdISl7GgsoHG5qk9QG3AvcIOks4HIK5yQC2FVSBjAX+PGJhiRNBa6NiJVF7X8TeBAQsDUinji9VTIzs77SY2hERDsl5yQkbQXmAHdHxNGsLFdaFhG3lrT1ArCypGwfhSuozMysn6vqSqWIOMLJq6UqlpmZ2cDiZ0+ZmVkyh4aZmSVzaJiZWTKHhpmZJXNomJlZMoeGmZklc2iYmVkyh4aZmSVzaJiZWTKHhpmZJXNomJlZMoeGmZklc2iYmVkyh4aZmSVzaJiZWTKHhpmZJXNomJlZsh5DQ9JYSdsldUh6WNIISZslPSVpZVG9U8okDZP0K0n57HNBN8u4XdLPJX2nNqtlZmZ9IWVP4zpgbUS0AoeAa4ChETETmCRpsqQrS8sovPd7S0Tkss/z5RqXNAOYBfwB8K+SLqvBepmZWR/oMTQiYn1E7MgmJwDXc/Jd4B0U/sPPlSm7GLhC0j9meyGV3kf+h8A/REQAjwNfrGZFzMys71X6j/xDJM0ExgEHgFey4sPAdGB0mbIngcsi4lVJ9wPzgK1lmh4N/LJo3uYyy14ILARobm4mn8+ndttKeNs1Bo/52vB2q72k0JA0HlgHXAUsBUZlX42hsLdyrEzZcxHRlZXtASZXaL7cvKeIiI3ARoCWlpbI5XIp3bZSj23D264xeMzXgMd7n0g5ET4CeAhYEREHgb0UDj8BTKOw51Gu7PuSpkkaCnwZeLbCIsrNa2Zm/VDKnsYCCoeb2iS1AfcCN0g6C7icwrmLAHaVlD0HPAgI2BoRT2R7LHdHxI1F7e8GviXp
HuCPso+ZmfVDPYZGRLQD7cVlkrYCcygEwNGsLFdSdpTCFVTFbR0Gbiwpez+7YupLwD0R8f+qXhszM+tTySfCi0XEEU5eLVWxrBftvQX8fTXzmpnZmeM7ws3MLJlDw8zMkjk0zMwsmUPDzMySVXUi3Po3SZW/W1O+vPAUFzOz7nlPYwCKiLKfzs7Oit+ZmaVwaJiZWTKHhpmZJXNomJlZMoeGmZklc2iYmVkyh8YgsGXLFs4//3wuvfRSzj//fLZs2VLvLplZg/J9GgPcli1baGtrY/PmzRw/fpyhQ4eyYMECAObPn1/n3plZo/GexgC3atUqNm/ezOzZsxk2bBizZ89m8+bNrFq1qt5dM7MG5NAY4Pbv38+sWbNOKZs1axb79++vU4/MrJE5NAa4KVOmsHv37lPKdu/ezZQpU+rUIzNrZA6NAa6trY0FCxbQ2dnJe++9R2dnJwsWLKCtra3eXTOzBtTjiXBJY4H/AQwF3gCupvD616nAtoi4M6u3ubis3HwR8U6Z9ocB/zf7ACyJiOdPd8Ws4MTJ7iVLlrB//36mTJnCqlWrfBLczKqSsqdxHbA2IlqBQ8A1wNCImAlMkjRZ0pWlZWXm+6MK7V8IbImIXPZxYNTY/Pnz2bdvH08++ST79u1zYJhZ1Xrc04iI9UWTE4Drgb/NpjuAWcDnOfl+8A5gVpn5/rXCIi4GrpA0G3geuCki3kteAzMzO2OS79OQNBMYBxwAXsmKDwPTgdFlyk6ZLyKertD0z4HLIuJVSfcD84CtJcteCCwEaG5uJp/Pp3bbihw7dszbrkF4zNeGt1vtJYWGpPHAOuAqYCkwKvtqDIVDXMfKlJXOV8lzEdGV/bwHmFxaISI2AhsBWlpaIpfLpXTbSuTzebztGoPHfA08ts3jvQ/0eE5D0gjgIWBFRBwE9lI4JAUwjcKex4fKysxXyfclTZM0FPgy8Gw1K2KV+TEiZlYrKXsaCygcbmqT1AbcC9wg6SzgcgrnJALYVVJWOl87hXMW10bEyqL2vwk8CAjYGhFP1GTNDPBjRMysxiq9/rO7D4VzG18BPtldWV98ZsyYEZbuvPPOi507d0ZERGdnZ0RE7Ny5M84777w69qqxAHuiD8d0Tx+P+eqcu+yRenehYXU35qt6YGFEHOHk1VIVy6z+/BgRM6sl3xE+wPkxImZWSw6NAc6PETGzWvL7NAY4P0bEzGrJoTEIzJ8/n/nz5/s+DTM7bT48ZWZmyRwaZmaWzKExCCxZsoSRI0cye/ZsRo4cyZIlS+rdJTNrUD6nMcAtWbKEDRs2sGbNGqZOncoLL7zAsmXLAFi3bl2de2dmjcZ7GgPcpk2bWLNmDUuXLmXkyJEsXbqUNWvWsGnTpnp3zcwakENjgOvq6mLRokWnlC1atIiurq4Kc5g1FkllPwfXXFHxO6ueQ2OAa2pqYsOGDaeUbdiwgaampjr1yKy2Kj0jqbOzs7vn51mVfE5jgPva1772wTmMqVOnsnbtWpYtW/ahvQ8zsxQOjQHuxMnuW2+9la6uLpqamli0aJFPgptZVXx4ahBYt24db7/9Np2dnbz99tsODDOrmkPDzMySOTTMzCxZyjvCx0raLqlD0sOSRkjaLOkpSSuL6iWVVVhGUj2rju8It8Fk7ty5DBkyhNmzZzNkyBDmzp1b7y4NKCl7GtcBayOiFTgEXAMMjYiZwCRJkyVdmVJWrvHUeladE3eEr169mu3bt7N69Wo2bNjg4LABae7cuXR0dLBo0SJ+9KMfsWjRIjo6OhwcNdRjaETE+ojYkU1OAK7n5GtdO4BZQC6xrJzUelYF3xFug8mOHTtYvHgx69evZ8yYMaxfv57FixezY8eOnme2JMmX3EqaCYwDDgCvZMWHgenA6MSycnqsJ2khsBCgubmZfD6f2u1Br6uri6lTp5LP5zl27Bj5fJ6pU6fS1dXl7diPecxXJyKYN2/eKeN93rx5tLe3exvWSFJoSBoPrAOuApYCo7KvxlDY
WzmWWFZOj/UiYiOwEaClpSX8IqF0TU1NvPDCCyxduvSDlzCtXbuWpqYmv5CpH/OYr44kHn30UdavX//BeL/55puR5PFeIz2GhqQRwEPAiog4KGkvhUNITwPTgJeAlxPLyinXntWI7wi3wWTOnDm0t7cDMG/ePG6++Wba29tpbW2tc88GkErPZil6Rsti4AiQzz5fBZ4F1gL7gbHAxxLLpgJ3lrT/oXrd9WfGjBlhvXPLLbdEU1NTANHU1BS33HJLvbvUUIA90cO/k778eMz3Tmtra0gKICRFa2trvbvUcLob84oqHt4laRwwB/hxRBzqTVlqe5W0tLTEnj17et1nw+8Ir5KkvRHRUq/le8xXx+O9et2N+aqePRURRzh5xVOvylLbMzOz/sd3hJuZWTKHhpmZJXNomJlZMoeGmZklq+rqqXqS9BpwsN79aFAfB35T7040oHMjYkK9Fu4xXzWP9+pVHPMNFxpWPUl76nnpqNmZ5PHeN3x4yszMkjk0zMwsmUNjcNlY7w6YnUEe733A5zTMzCyZ9zTMzCyZQ2OQkdQk6TP17ofZmeIxX1sOjQYj6R5J0yXdJ+n3ynz/v7KnBldyLYUXavW0nG9JapE0RNIfSvqMpBtPp+9m1fCY718cGo1nZPbn94CvqmC4JGXlPwAuOFFZ0vCinycAdwJjJD0iaW/256OSOovqjaTwYqxfADMpvBf+V8BX+nC9zCrxmO9HfCK8n5P0PeAzwBtZ0WcovEv934Am4BpgOTAFKP3LFPDriPjP2T+kbcDOiLgra/vpiLi4zDL/ApgQESsl/XdgTUQ8L2kdsCUiflrr9TQ7wWO+f6vqfRp2Rr0PfC0iXgSQ9OcUXqW7D5iWvbTqv0qaHRHFvzkNA26LiL/Kis6m8ErdT0p6JCv7rKRHKYyDrRHxbUmfBv4S2CxpNvB+RDyf1f8G8D8lXRERv+vLlbZBzWO+H3No9H8BPJDtij8DPAJMAv4A+D9F9VZIejcidmfTXwA+90EjEQeAr0vaCbRGxHvZb13zSpb3ReCvKfx29y3gFUlHKLyS99PAP1E4Rvzd2q6m2Qc85vsxh0b/Nwq4EmjO/nwWuBkYAXy9qN4aCrvsV2TTf0LhGDAAkoZmP5Y9HilpCIXDlQ9KmgWMBy4B3qHwG9k8SSsovDt4R21Wzawsj/l+zCfC+79zgN8CYykc1z0ATANWF1eKiCeBc7MrPkRh8D9eVGUhheO7bwE/zHbXP5udFHwk+25BSZtvAp+ncFgACseT36zp2pl9mMd8P+Y9jX5M0seAj1G4quNu4Dbg74BdwB8Dn8jqbIiI94E7gDFAK5DPdsdP/DbVDrSXtP+ziLiCDxsCDJE0gsIx3b/MyicAR2u7lmYnecz3fw6N/m0xcD/wE+Am4DvA30TEA5IeAO4BvgrslvR+6cySdlO4muR7wKYy7Y+psNwmCocC/gZ4MCL2Z1e0jAP++XRWyKwHHvP9nC+57cdOXG8eEe9mu9+fjIhXi77/9xHRZwNaksIDxM4gj/n+z6FhZmbJfCLczMySOTTMzCyZQ8PMzJI5NMzMLJlDw8zMkv1/GXM7FIzYu9kAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 432x288 with 2 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "df[df.国家.isin([\"中国\",\"美国\"])][['国家',\"成立年份\"]].groupby ( by = '国家' ).boxplot()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* 把中位数、四分位数最大值和最小值做成箱盒图"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 本周我的总结"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* 本周我学习了数据框分併合\n",
    "* groupby和agg的使用方法和意义\n",
    "* rename和sort_values的意义\n",
    "* boxplot的用法和看法（怎么看）\n",
    "* 了解了分组的意义\n",
    "> 分组统计目标是分组例子和统计描述的这个量度是不是有差的，分进合击有没有差，差在哪？"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {
    "height": "calc(100% - 180px)",
    "left": "10px",
    "top": "150px",
    "width": "281.390625px"
   },
   "toc_section_display": true,
   "toc_window_display": true
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
